Much research has been devoted to developing efficient routing
algorithms for data networks, particularly those referred to as packet
switching, distributed networks. More recent developments have led to
algorithms which are shown to produce optimum routes with non-looping
characteristics. Several of these algorithms have been applied and are
currently being used in operational networks. They are, however,
restricted to use in small-to-medium scale networks because of excessive
overhead which impacts both circuit bandwidth and nodal storage. This
research investigates algorithms which function relatively independently
of storage and bandwidth and are therefore adaptable to any size network.

The  primary  tool  for  demonstrating  efficient  algorithms  lies  with 
simulation.  The  importance  of  mathematical  techniques,  however,  cannot 
be  overlooked.  Therefore,  the  initial  phase  of  the  research  involves 
the  investigation  and  development  of  abstract  analytical  concepts  which 
provide  an  impetus  to  the  design  of  the  simulator.  The  approach 
employs  a heuristic  searching  mechanism  which  requires  that  a 
network  be  described  as  a graph  using  the  root-node-leaf  notation. 

The level of the tree is equivalent to the known delay about a network at
any particular node. The algorithm searches the tree down each leg,
evaluating the path from each leaf to the destination node using
heuristic information to determine the optimum path. This approach is
combined with the classical decomposition-synthesis network evaluation
technique to derive a formula for delay. Several heuristic measures
applicable to this formula are evaluated by the simulator.

The  simulation  is  described  in  fairly  general  detail.  It  is 
designed  so  that  additional  heuristics  may  be  employed  as  they  are 
developed. The GASP-IV simulation language was selected because it
is  FORTRAN  based  and  because  it  allows  a greater  understanding  of  the 
network  simulation  process.  Additionally,  its  output  is  thorough 
and  easily  interpreted. 

The results of the simulator provide a foundation for sound
conclusions about the operating characteristics of the evaluated algorithms on
large distributed networks. The more important conclusions are the
determination: 1) of an optimum search depth, 2) of an efficient
heuristic measure of delay, and 3) that the best of the algorithms
evaluated performs as well on networks of all sizes as existing algorithms
do on small networks. These conclusions are based on measures of delay,
queue lengths, and utilization of networks of various sizes. Though
much research remains for large capacity networks of many nodes,
statistical data generated by this research provide the basis for a
relatively simple method for routing messages across large scale networks
of the future.


ACKNOWLEDGEMENTS 


I am  deeply  indebted  to  my  chairman,  Dr.  Udo  Pooch,  for  his  advice 
and patience throughout my tenure at Texas A&M University. His uncanny,
almost  unique,  ability  to  "see  through  the  haze"  resulted  in  many  ideas 
at  the  inception  and  throughout  the  progress  of  this  work.  He  has 
provided  support,  direction,  and  inspiration  to  complete  tasks  in  a 
professional  manner.  Also,  much  gratitude  is  due  to  Dr.  Glen  Williams, 
who  contributed  a tremendous  insight  to  the  complications  of  simulating 
large  networks,  and  to  Drs.  Rahul  Chattergy  and  Gary  Richardson  for 
their  support  during  the  documentation  phase. 

This  dissertation  would  be  incomplete  without  acknowledging  the 
technical  support  of  Jerry  Adkins  and  Terry  Humphreys  who,  with  their 
own  research  worries,  found  time  to  listen  and  assist  on  some  of  the 
more  complex  ideas.  To  Beth  Baker,  my  sincere  appreciation  for  her 
professional  typing  and  drafting  abilities  and  her  dedication  to  see  me 
through  the  final  product. 

Particular recognition is due to my wife, Francis, and children,
Ann and Kirt, who helped and assumed added responsibilities in my stead,
enabling me to devote my full time and effort to this research and
dissertation.




To  my  wife,  Francis,  who 
believes in the impossible.


TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
DEDICATION
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

Chapter

I. INTRODUCTION
1. General
1.1 User Perspective
1.1.1 User Characterization
1.1.1.1 Description of Factors
1.1.1.2 User Categories
1.1.2 Design Objectives
1.1.2.1 Reliability
1.1.2.2 Transparency
1.1.2.3 Economy
1.1.2.4 Convenience
1.1.2.5 Security
1.1.2.6 Other Considerations
1.2 Research Objective and Plan

II. LITERATURE SURVEY
2. Overview
2.1 Historical Background
2.2 Functional Classifications
2.2.1 Circuit Switching
2.2.2 Message Switching
2.2.3 Packet Switching
2.3 Topological Classifications
2.3.1 Centralized Networks
2.3.2 Decentralized Networks
2.3.3 Distributed Networks
2.4 Routing Classifications
2.4.1 Deterministic Algorithms
2.4.1.1 Flooding Techniques
2.4.1.2 Fixed Techniques
2.4.1.3 Split Traffic Techniques
2.4.1.4 Ideal Observer Techniques
2.4.2 Stochastic Algorithms
2.4.2.1 Random Techniques
2.4.2.2 Isolated Techniques
2.4.2.3 Distributed Techniques
2.4.3 Flow Control Algorithms
2.4.3.1 Isarithmic Techniques
2.4.3.2 Buffer Storage Allocation Techniques
2.4.3.3 Special Route Assignment Techniques
2.5 Discussion

III. THE USE OF QUEUING THEORY TO MODEL NETWORKS
3. Introduction
3.1 Network Characterization
3.2 Model Description
3.3 Developing a Cost Function
3.4 The Minimum Cost Flow Problem
3.4.1 Determining the Path of Least Cost
3.4.2 Determining an Optimum Traversal
3.4.2.1 An Optimal Search Algorithm
3.4.2.2 The Admissibility of A*
3.4.2.3 The Optimality of A*
3.4.3 Optimum Routing as a Function of A* and B*
3.5 Summary

IV. THE NETWORK SIMULATION MODEL
4. Introduction
4.1 Properties of the Simulator
4.2 General Program Flow
4.3 Description of Experimental Models

V. ADAPTIVE ROUTING FOR LARGE DISTRIBUTED NETWORKS
5. Introduction
5.1 Determining the Optimum Search Depth
5.2 Evaluating Other Coordinate Address Techniques (CATs)
5.3 Establishing Confidence in the Simulator
5.4 Comparative Performance Evaluation
5.5 Results of Large Scale Network Simulation

VI. CONCLUSIONS AND RECOMMENDATIONS
6. Conclusions
6.1 Recommendations

REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
VITA




LIST OF FIGURES

Figure

1. Example format for message switching network
2. Example format for packet switching network
3. Example of packet switching network
4. (a) Centralized network, (b) Decentralized network, (c) Distributed network
5. Centralized network with concentrators/multiplexors
6. Queue structure for node j
7. State-transition-rate diagram for M/M/C
8. Tree representation of equation 3.14
9. Comparison of execution times for search depth 2
10. General system flow
11. GASP file entity structure (exclusive of file pointers)
12. Results of update interval analysis (load factor = 0.2)
13. Simulation performance
14. An ill-defined network with looping
15. Sixteen node network
16. Twenty-five node network
17. Thirty-six node network
18. Average hops per packet, 16 node network
19. Average hops per packet, 25 node network
20. Average hops per packet, 36 node network
21. Average queue length for 16 node network
22. Average queue length for 25 node network
23. Average queue length for 36 node network
24. Average delay for 16 node network
25. Average delay for 25 node network
26. Average delay for 36 node network
27. Utilization for 16 node network
28. Utilization for 25 node network
29. Utilization for 36 node network
30. Average queue lengths for CATn
31. Average delay for CATn
32. Average utilization for CATn
33. 95% confidence interval for queue lengths using CAT3
34. 95% confidence interval for delay using CAT3
35. 95% confidence interval for utilization using CAT3
36. Message trapping in the GMA
37. Trapping with the RMA
38. Average delay comparison
39. Average hops for 256 node network
40. Average queue length for 256 node network
41. Average delay for 256 node network
42. Average utilization for 256 node network




LIST OF TABLES

Table

I. Classification of routing algorithms
II. Properties of special route assignment technique
III. Periodic Update Algorithm
IV. Global Mapping Algorithm
V. Regional Mapping Algorithm
B1-1. Sixteen node network max hop/average hop data
B1-2. Twenty-five node network hop data
B1-3. Thirty-six node network max hop/average hop data
B1-4. Sixteen node network queue length/observation data
B1-5. Twenty-five node network queue length/observation data
B1-6. Thirty-six node network queue length/observation data
B1-7. Sixteen node network average delay data
B1-8. Twenty-five node network average delay data
B1-9. Thirty-six node network average delay data
B1-10. Sixteen node network utilization data
B1-11. Twenty-five node network utilization data
B1-12. Thirty-six node network utilization data
B1-13. Supporting data for three algorithms evaluated
B1-14. Confidence interval data
B1-15. Supporting data for comparison curves
B1-16. 256 node network data
C1-1. Simulation options


CHAPTER  I 


INTRODUCTION 


1. General

Computer networks have been classified as either a network of computers
or a set of terminals connected to one or more computers (17). Most
computer networks consist of hosts, terminals, nodes, and transmission
links. A node generally refers to a computer whose primary function is
to switch data. Computers used primarily for functions separate from
that of switching data are referred to as hosts. Some designs permit
the node and host functions to be performed by the same computer.
Terminals are devices which interface the user to the computer or computer
network, and transmission links join this collection of subnet elements
together to form a network (64). The transmission links and nodes, along
with the essential control software, make up the communications subnet
(35), usually referred to as the data network.

1.1 User Perspective

Prior  to  a review  of  literature  on  data  communications  networks,  it 
is  useful  to  have  an  understanding  of: 

(a)  how  designers,  managers,  and  operators  categorize  data  network 
users,  and 

(b)  network  design  philosophy  in  meeting  basic  user  requirements. 


The  Communications  of  the  Association  for  Computing  Machinery  is 
used  as  a pattern  for  format  and  style. 


1.1.1 User Characterization

1.1.1.1 Description of Factors

A number of factors are related to the categorization of data
communications users. The description of these factors has been limited to
three broad categories which constitute a majority of the functions for
which computer networks have an application.

1. Remote User Access: When users are not in the general vicinity
of a computer system and need to have access to that system, they are
considered to be remote users. Remote access is made available by
attaching a terminal to some communications medium which is then
interfaced to a computer.

Depending upon individual requirements, the user may interact with
the computer for computational power or for access to data. The
interactive user generally expects to receive a response within seconds of
having generated some stimulus which would normally cause a response.
However, the response to remote job entry, another use of remote
terminals, is dependent on the time to process the entire job and the
availability of the communications medium, which is generally shared with
other users.

2. Computer-to-Computer: Direct communication between computers is
creating the greatest demand for data communications media (5). Given
the proper stimulus, a computer will generate bits of information for
transfer at a much faster rate than the remote user described
in the preceding paragraph. Portions of or even whole data bases can be
conveniently transferred between computers without manual intervention.

3. Message Traffic: Much communication traffic is generated because
some user desires to send a message to another user. The information
transferred in user-to-user fashion is referred to as message level traffic.
A second type of message level traffic involves computer-to-user messages.
Given a predetermined sequence of events, many computers are programmed
to automatically transmit messages to their (remote) terminals. The
computer-to-user communications are frequently classified as message
traffic in the same context as terminal-to-terminal communications.

1.1.1.2 User Categories

Considering  the  above  factors,  description  of  users  is  logically 
divided  into  three  categories.  The  first  category  shall  be  called  the 
real-time  user.  A real-time  computer  system  may  be  defined  as  one  that 
receives  and  processes  an  input  and  returns  a result  with  sufficient 
speed  to  affect  the  function  at  the  terminal  within  an  environmentally 
defined  time  frame  (56).  The  user  of  a real-time  computer  system  is  a 
real-time  user. 

The second category shall be referred to as the teleconference user.
These are users that desire to communicate a complete idea or concept
as the communications entity. Messages of this nature are usually context
sensitive and place no restrictions on the order in which symbols may
appear in the message. As such, they are not easily adaptable for
communicating with a computer. Therefore, user-to-user communications
constitute the majority of message traffic; the users of these terminals
are in a teleconference mode of communications.

The third and final category of user is referred to as the
data-sharing user. The user in this context could be a computer (or the owner
of the computer) which needs to share its data with other computers, or
conversely, cause data to be shared with it. The data to be shared
(communicated) could be as small as a few bits or as large as an entire
data base. The data-sharing user will be a significant factor in the
design of future computer networks.

1.1.2  Design  Objectives 

As with the design of any unit or system which depends on its users
for operational funding, computer network designers must balance demand
against capability (40). However, certain minimum requirements exist.
In general, a computer network must be able to provide reliable,
error-free communications within a reasonable time frame as defined by the
user. The designer interprets these general objectives in the following
terms:

a) Reliability (uninterruptible, error-free service)

b)  Transparency  (network  operation  should  be  invisible  to  the  user) 

c)  Economy  (minimum  overhead  and  efficient  use  of  media) 

d)  Convenience  (user  access  methodology) 

e)  Security  (as  required  by  the  user). 

1.1.2.1 Reliability

Reliability is an inherent characteristic of any design effort.
Reliability in a computer network refers to its ability to provide
uninterruptible, error-free service. Uninterruptible service is greatly
dependent upon a design philosophy which addresses the question, "To what
extent should alternate transmission paths and backup equipment be
provided?" The answer to this question generally requires a statistical
analysis of cost versus hardware reliabilities and a queuing analysis of
load factors generated by potential users. Load factor analysis has
played a major role in the design of new computer switching networks
discussed later.
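
As a rough illustration of the cost-versus-reliability question posed above, the following sketch computes the availability of a connection served by several alternate paths. It assumes that the paths fail independently and that every path has the same availability and the same cost; the function names and the sample figures are hypothetical and are not taken from this research.

```python
def parallel_availability(path_availability: float, n_paths: int) -> float:
    """Probability that at least one of n independent paths is usable."""
    if not 0.0 <= path_availability <= 1.0:
        raise ValueError("path_availability must lie in [0, 1]")
    if n_paths < 1:
        raise ValueError("n_paths must be at least 1")
    # The connection is down only when every path is down at once.
    return 1.0 - (1.0 - path_availability) ** n_paths


def cheapest_design(path_availability, cost_per_path, target, max_paths=10):
    """Return (paths, total cost, availability) for the smallest number of
    paths meeting the target availability, or None if none does."""
    for n in range(1, max_paths + 1):
        availability = parallel_availability(path_availability, n)
        if availability >= target:
            return n, n * cost_per_path, availability
    return None


if __name__ == "__main__":
    # Hypothetical figures: each path is up 90% of the time and costs 100 units.
    print(cheapest_design(0.9, 100.0, target=0.995))
```

Under these assumptions each added path buys a smaller gain in availability for the same added cost, which is why the extent of backup provision is a design decision and not a fixed rule.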

1.1.2.2 Transparency

Transparency is one of the most important design features of any
communications network. Because of technological constraints, most
networks in existence do not possess total transparency; that is, there
are certain values (bit configurations) which cannot appear in the text
of a message because of automatic hardware control actions which will
take place. For that purpose, the American National Standards Institute
(ANSI) has set aside, as a standard, certain bit values to be used
exclusively for hardware control. Since they are values for which no other
use has been designated, one could conclude that the transparency problem
has been circumvented. Such is not the case, however, since certain types
of user schemes, e.g., facsimile, graphics, and raw satellite weather and
photographic data, will generate fields of unknown values with a high
probability that designated control values will be contained somewhere in
the text. Transmitting this type of information over computer networks
which are not totally transparent will cause unpredictable and probably
unsuccessful results.
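
The transparency problem can be illustrated with a short sketch. It assumes a hypothetical framing scheme in which the ASCII control characters STX (0x02) and ETX (0x03) delimit the text of a message; the helper names are illustrative only and do not describe any particular network. A receiver that scans for ETX delivers character text intact but truncates a binary payload that happens to contain that value.

```python
STX = 0x02  # ASCII "start of text" control character
ETX = 0x03  # ASCII "end of text" control character


def frame(text: bytes) -> bytes:
    """Wrap message text in STX ... ETX (hypothetical framing scheme)."""
    return bytes([STX]) + text + bytes([ETX])


def naive_receive(frame_bytes: bytes) -> bytes:
    """Return the bytes between STX and the first ETX encountered."""
    if not frame_bytes or frame_bytes[0] != STX:
        raise ValueError("frame does not start with STX")
    end = frame_bytes.find(bytes([ETX]), 1)
    if end == -1:
        raise ValueError("no ETX found")
    return frame_bytes[1:end]


def is_transparent_for(text: bytes) -> bool:
    """True when the text survives framing and reception unchanged."""
    return naive_receive(frame(text)) == text


# Character text contains no control values and passes through intact.
character_text = b"WEATHER REPORT 1200Z"
# Raw data (for example, a facsimile scan line) may contain any value,
# including 0x03, which the receiver mistakes for the end of the message.
facsimile_scan = bytes([0x10, 0x7F, 0x03, 0x41, 0xFF])

if __name__ == "__main__":
    print(is_transparent_for(character_text))
    print(is_transparent_for(facsimile_scan))
```

In this sketch the receiver has no way to distinguish a control value from data, so everything after the embedded 0x03 is lost.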

Technology has only recently advanced so that total transparency is
a realistic design goal. The problem remains, however, to have a scheme
standardized so that user computers of many types may communicate without
major interface redesign. An ANSI group has been established for that
purpose and a standard has been proposed (78) for review by the
appropriate committee members.

1.1.2.3 Economy

Expediency of communications is an ever-increasing demand. Economic
considerations, however, seriously constrain any significant wide-scale
improvement in data transmission speeds in the near future. The problem
arises because the primary communications medium, landlines, was designed
for voice communications at bandwidths less than that required for
wideband data transmission. The upgrade of existing landline networks or the
installation of a new medium capable of communicating wideband data would
involve astronomically high costs to be responsive to the current
rate of change in demand. Except for backbone circuits, economic
considerations will inhibit the widespread use of high speed data rates
through the life span of the existing landline systems.

A computer network should function with minimum overhead. Since, in
general, computer power exceeds that necessary to keep links saturated,
overhead is essentially a discussion of link loads. With less overhead
on a link, more efficient utilization can be made of available computer
power with a corresponding increase in effective transmission rate. What
is link overhead? It is that portion of a message exclusive of text
required to communicate between computers. A certain amount of control
information must be attached to each message for the receiving computer
to interrogate in order to determine the proper course of action for the
message. The efficient design of a scheme for transmitting control
information will tend to minimize overhead without restricting flexibility.
This design problem is compounded by differing media and computer
interface characteristics. The complexity of accounting for these divergent
factors makes it no simple task to design a "best" scheme for minimizing
overhead.
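
The relation between link overhead and effective transmission rate can be made concrete with a little arithmetic. The sketch below assumes a message consisting of a number of control bits plus a number of text bits, sent over a link with a given nominal rate; the function names and the sample figures are hypothetical.

```python
def overhead_fraction(control_bits: int, text_bits: int) -> float:
    """Fraction of the transmitted message that is control information."""
    if control_bits < 0 or text_bits < 0:
        raise ValueError("bit counts must be non-negative")
    total = control_bits + text_bits
    if total == 0:
        raise ValueError("message must contain at least one bit")
    return control_bits / total


def effective_rate(link_rate_bps: float, control_bits: int, text_bits: int) -> float:
    """Rate at which message text crosses the link, in bits per second."""
    if control_bits < 0 or text_bits < 0:
        raise ValueError("bit counts must be non-negative")
    total = control_bits + text_bits
    if total == 0:
        raise ValueError("message must contain at least one bit")
    # Only the text portion of each message counts as useful throughput.
    return link_rate_bps * text_bits / total


if __name__ == "__main__":
    # Hypothetical message: 200 control bits and 800 text bits on a 9600 bps link.
    print(overhead_fraction(200, 800))
    print(effective_rate(9600.0, 200, 800))
```

Under these assumptions, halving the control information attached to the same text raises the effective rate without any change to the link itself, which is the sense in which overhead is a question of link load.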

1.1.2.4 Convenience

Though not so readily apparent, a critical design factor is user
convenience. If accessing a specific type of computer network is
troublesome, that network is competitively placed at a disadvantage to other
types of networks. The reputation of a network, regardless of its other
features, can be quickly tainted through the only physical interface to
the user. The designer must attempt to keep access procedures simple in
order to retain customers.

1.1.2.5 Security

Data transferred through a computer network should be securable if
so desired by the user. Current computer technology constrains the amount
of security the user can expect from the communications system. Since
there are no perfectly secure computer systems, the network design should
consider various computer design features which tend to enhance security.
Some computers are more securable than others and should be appropriately
weighted for that purpose. A combination of sophisticated hardware and
simple software increases the chance of providing a secure data
communications medium.

1.1.2.6 Other Considerations

The management of a geographically dispersed system which potentially
interfaces with thousands of users possessing equipment from hundreds of
different vendors, each with its own interface characteristics, poses
unusual problems. Under current guidelines it is improbable that any one
organization could collect the necessary talent to cope with engineering
and "finger-pointing" problems that are inevitable in a multivendor
environment without pricing the service out of reach. In reviewing the
literature one gains an intuitive feeling that a common technological
base is necessary between users and vendors for the independent success
of a computer network system. Such a base, in the form of internationally
established computer interface standards, is slowly evolving through a
group of user, government, and vendor representatives called the
International Standards Organization (ISO) (3). The required standards are
expected to be published within the next two years. Though
other management problems must be overcome, none possess the novelty or
approach the complexity of that described above.

There are certain social implications of linking data bases which
contain information of one form or another on every U.S. citizen.
Regulatory bodies have already indicated their awareness of the problem by
invoking a privacy act which restricts release of private personal
information by federal agencies (77). As a result, security is now being
given increased emphasis in nongovernment systems (security has always
been a primary design factor in government systems). As previously
stated, security has certain technological constraints which, at least
for the near future, will inhibit the use of general computer networks
for extremely sensitive communications.

1.2  Research  Objective  and  Plan 

This dissertation is concerned with the development of data
communications network routing algorithms which are independent of network size.
The algorithms should possess many of the characteristics of previously
developed algorithms which allow dynamic rerouting of messages for varying
system loads and minimum time for network traversal. But, contrary to
the previous developments which possess these characteristics, procedures
developed herein should not require the large storage arrays for
determining the minimum time path to other nodes or subnets in the network nor
should they require the overhead for preventing looping of messages within
the network. The purpose of this research, then, is to develop a routing
algorithm which can be used for any size network while retaining as many
of the desirable properties of existing dynamic routing algorithms as
possible.

In  Chapter  II  a survey  of  the  literature  related  to  the  routing 
problem  is  provided.  Detailed  concepts  and  terminology  are  presented  as 
support  material  to  succeeding  chapters. 

Chapter III describes a broadly defined analytical model
of the investigated algorithms. Many aspects of the analytical model
necessarily remain abstract to allow flexibility for investigating the
various alternatives encountered. Moreover, the analytical model is incomplete
since mathematical techniques of sufficient power to completely describe
such networks have not yet been developed. As a result of these mathematical
deficiencies, much development has occurred through simulation. A
description of the simulator developed in support of the research is
presented in Chapter IV.

An  algorithm  which  is  adaptable  to  any  size  network  is  examined  in 
Chapter  V.  A detailed  analysis  is  provided  through  the  use  of  data 
obtained  from  the  simulation  phase. 


Finally, a summary of the results of the research, conclusions, and
proposed recommendations and suggestions for further research are
presented in Chapter VI. Applicable queuing theory, simulation data, and a
program listing of the simulator model are contained in Appendices A, B,
and C, respectively.


CHAPTER II


LITERATURE SURVEY

2.  Overview 

The complexity of computer networks has taken a dramatic upswing
along with the more significant developments in electronic technology
such as medium and large scale integrated circuitry and microprocessors
(53). Along with this upswing in complexity, there have evolved several
sophisticated methods of classifying networks based upon message routing
schemes, topology, network functional aspects, and combinations of these.
A detailed discussion of these classifications is provided after a brief
review of history on data networks.

2.1  Historical  Background 

Initial computer installations, in the late 1940's and early 1950's,
were  either  dedicated  to  a particular  research  problem  or  used  for  finance 
accounting.  Since  that  time,  the  explosive  growth  of  computer  technology, 
both  hardware  and  software,  has  placed  the  computer  into  a wide  spectrum 
of  applications.  Individual  installations,  however,  were  prone  to 
develop  sophisticated  software  not  readily  transportable  to  the  wide 
diversity  of  machines  and  interfacing  software.  As  a result,  users 
frequently  were  required  to  develop  duplicate  software  packages  which 
could  be  economically  adapted  to  their  own  installation. 

During the same time frame, military planners were becoming
increasingly concerned with providing survivable, low-delay communications to
support advanced weapon systems. High speed digital computers were
employed to meet the critical response times for air defense communications.
A computerized air defense system called SAGE (72), installed in 1958,
was the first attempt to interconnect computers on a large scale. Later,
in the early 1960's, the Automated Digital Information Network (AUTODIN)
was installed to provide the Department of Defense with quasi-survivable
data communications (4). Development of AUTODIN provided the initial
impetus toward the design of message routing schemes (60,70).

The experience of the military planners and supporting contractors
led to development of the early systems relying heavily on common carriers
rather than dedicated lines (CYBERNET (54) and DATRAN (8,26,34)). Store
and forward networks first began to appear in 1969 as a cost effective,
common carrier approach in response to the popularity of the many
time-sharing systems on the market at that time (34). Though time-sharing
systems allowed many users to simultaneously interact with a machine,
usage was not uniform throughout the day (51,57).

The  nature  of  scheduling  one's  activities  led  to  relatively  short 
periods  of  peak  usage  during  which  machines  became  strained  and  even 
saturated.  Administrative  procedures  for  scheduling  terminal  usage 
produced  better  utilization  of  the  machine  during  slack  periods  but  did 
not  prevent  periodic  peaks  from  saturating  the  system.  The  advancement 
of  network  technology  was  encouraged  as  a solution  to  the  problem  of 
program  transportability  and  peak  machine  loadings. 

In  1967,  Roberts  (71)  proposed  a method  for  better  utilization  of 
resources  through  load-leveling,  elimination  of  functional  duplication, 
and specialization of hardware and software. The Advanced Research
Projects Agency (ARPA), largely as a result of Roberts' research, proposed
the concept of building a distributed switchpoint system having non-homogeneous
hosts and lines. Such a network, according to ARPA, would have
specialized capabilities throughout the system to meet the unique requirements
of each user. The research efforts through ARPA have been the
impetus for the rapid growth of computer networks during the 1970's.

2.2 Functional Classifications

There  are  basically  three  technologies  for  interconnecting  computers, 
each  with  its  own  distinguishing  characteristics.  "These  technologies 
are  distinguished  by  the  manner  in  which  resources  are  allocated  in 
support  of  communications  ..."  (43)  and  therefore  take  on  the  view  of  the 
designer  who  must  make  maximum  utilization  of  the  resources.  The  techno- 
logies are: 

a)  circuit  switching, 

b)  message  switching,  and 

c)  packet  switching. 

A conceptual  description  of  each  is  contained  in  the  following  para- 
graphs along  with  some  comparisons  and  applications.  The  reader  desiring 
more  detail  is  referred  to  Kimbleton  and  Schneider  (43)  and  the  many 
references  contained  therein. 

2.2.1  Circuit  Switching  (15,16,21,28,38,43,57) 

Circuit  switching  is  analogous  to  the  telephone  (voice)  switching 
networks  where  a complete  circuit  or  route  is  established  prior  to  the 
start  of  communication  by  the  users  (12).  It  comes  in  two  forms,  manual 
or  automatic,  both  involving  the  exclusive  dedicated  use  of  circuits. 

The  manual  switching  of  circuits  is  used  mostly  with  remote  terminals, 
generally  for  interactive  type  communications.  In  this  mode,  the  user 
dials  the  proper  telephone  circuits  for  access  to  the  desired  computer 
system.  If  the  path  is  unacceptable  or  if  access  to  another  computer 
is desired, the user terminates the existing connection and redials
(switches)  to  another  circuit.  Automatic  circuit  switching  systems,  on 
the  other  hand,  require  the  use  of  electronic  switching  mechanisms  which 
automatically  connect  the  required  circuit  when  pulsed  with  the  proper 
sequence  of  bits.  Both  modes  of  circuit  switching  experience  line 
contention  delay  when  distant-end  user  circuits  are  busy. 

Though  widely  used  for  individual  remote  terminal  access,  circuit 
switching  has  not  been  considered  as  having  significant  network  potential, 
both in terms of efficiency and economic practicality. However, the
not-too-distant future holds some evolutionary steps in switching speeds of
solid state devices which will justify a reevaluation of circuit switching
as a viable alternative in network design (43). Networks which use
some form of circuit switching are DATRAN (57) and TYMNET (76).

2.2.2  Message  Switching  (12,21,38,45,63,67) 

A message  is  defined  to  be  a logical  unit  of  information  from  the 
viewpoint  of  the  user  (43).  Thus,  telegrams,  programs,  and  data  files 
are  examples  of  messages.  A message  switching  subnet  may  be  regarded  as 
a collection  of  physical  circuits  interconnected  by  computer  switches 
which  have  the  ability  to  interrogate  message  control  fields  for  deter- 
mining subsequent  action.  It  differs  from  circuit  switching  in  that 
circuits  are  no  longer  dedicated  for  exclusive  use.  Message  switching 
is,  therefore,  considered  to  be  less  expensive  since  circuit  costs  are 
amortized  among  the  users  sharing  the  system.  However,  any  significant 
increase  in  switching  speeds  of  evolving  technology  could  cause  circuit 
switching  to  have  a cost  advantage.  AUTODIN  is  a prime  example  of  a 
message  switching  network  (4). 

Messages within a message switching network are communicated between
switches  on  a message-by-message  basis.  This  means  that  a message, 
broken  into  blocks  of  data,  must  have  been  either  transmitted  and  received 
in  its  entirety  or  canceled  across  a link  before  the  next  message  can  be 
transmitted.  Each  block  of  the  message  must  be  transmitted  in  its  proper 
sequence  so  the  receiving  switch  can  rebuild  the  message  and  verify  to 
the  sending  switch  that  it  has  taken  accountability  for  the  message 
(Figure  1).  (Note  that  only  the  initial  block  has  sufficient  control 
information  for  further  routing.)  If  the  message  is  not  accepted,  it 
must  be  retransmitted  until  accountability  can  be  verified  by  the 
receiving  switch.  Direct  access  storage  devices  (DASD)  are  employed  to 
prevent  unreasonable  restrictions  on  message  lengths  and  to  store 
messages  in  cases  of  circuit  failure  or  heavy  loading.  Because  of  these 
characteristics,  message  switching  is  frequently  referred  to  as  store- 
and-forward  message  communications.  Generally,  real-time  statistics 
are  not  kept  and,  thus,  dynamic  load  balancing  is  made  impractical. 

For  that  reason,  heavy  loading  on  any  one  specific  circuit  is  not 
normally  sufficient  criteria  to  seek  alternate  traffic  routing.  Contrary 
to  the  operation  of  most  packet  switching  networks  (described  below), 
alternate  routing  in  message  switching  networks  is  reserved  almost 
entirely  for  circuit  or  switch  failure. 

2.2.3  Packet  Switching  (25,28,39,43,58) 

A new technique for data communications which has evolved over the
past ten years is called packet switching. As with message switching,
each message is subdivided into blocks of data, called packets, for circuit
transition. However, in packet switching each packet has attached
to it sufficient control information that the packet can be communicated
across a network independent of other packets belonging to the same
message (Figure 2). The following example, whose notation can be traced
to  Kleinrock  (50),  describes  how  a message  might  traverse  a packet 
switching  network. 

Consider a five packet message (Figure 3) which must traverse the
six switch (node) network from A to F. The path of each packet can be
represented by a set of ordered pairs designating the circuits between
nodes, i.e., the circuit from A to C is designated $P_n = (ac)$ and
the circuit from C to D is designated $P_n = (cd)$, such that the corresponding
path from A to D becomes $P_n = (ac,cd)$, where n is a unique number
assigned to a particular packet traversing path P. Note that all links
communicate in both directions such that a packet going from C to A would
have path $P_n = (ca)$. Through a complex mechanism of evaluating individual
circuit and switch loads, packet switching computers are generally programmed
to dynamically alter routing of packets to the circuit of minimum
loading. For simplicity, assume that unique numbers 1 through 5 have been
assigned to packets one through five, respectively. The originating
switch, node A, may determine that packets should be transmitted alternately
to nodes B and C such that $P_1$ goes to B and $P_2$ goes to C, etc.
Now node B may determine that $P_1$ and $P_5$ should go to D and $P_3$ should go to
E. Node C, on the other hand, has also sent $P_2$ and $P_4$ to D. Node D, now
with four of the five packets, dynamically determines that it can relieve
its load by alternately transmitting the packets to nodes E and F. The
end result is that packets arrive at F out of sequence. This presents no
problem, however, since node F has been programmed to hold final delivery
of the message until the entire message is received. If node F or any
node between A and F receives a garbled packet, the sending node will be
requested to retransmit the packet. However, if F has not received all
of the packets after a specified time interval, node A will be requested
to retransmit the missing packets, having assumed that the first transmission
was lost in noise, equipment failure, etc. The following path
descriptors for packets in the example emphasize the flexibility with
which a message may traverse a packet switching network:

$P_1 = (ab,bd,de,ef)$,

$P_2 = (ac,cd,df)$,

$P_3 = (ab,be,ef)$,

$P_4 = (ac,cd,de,ef)$,

$P_5 = (ab,bd,df)$.

Figure 3. Example of packet switching network.

The  ability  to  immediately  transmit  a packet  without  waiting  for  a 
complete  message  and  the  ability  to  dynamically  adjust  to  varying  load 
factors  minimize  resource  requirements  at  each  node  (58).  Nodes  which 
become  saturated,  however,  are  generally  programmed  to  reject  traffic 
until  their  traffic  load  is  normalized.  The  overall  effect  is  a decrease 
in  system  cost  with  a corresponding  effect  on  user  rates.  Example  net- 
works using  packet  switching  technology  are  ARPANET  (50,72)  and  MERIT 
(5,75). 
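
The five-packet example of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PATHS`, `is_valid_path`, and `Reassembler` are hypothetical and do not appear in the original example, and the payload strings used with them are placeholders. The sketch checks that each path descriptor is a connected chain of circuits from A to F and shows how a destination node might hold delivery until every packet has arrived.

```python
# Path descriptors from the five-packet example; each pair names one circuit.
PATHS = {
    1: ["ab", "bd", "de", "ef"],
    2: ["ac", "cd", "df"],
    3: ["ab", "be", "ef"],
    4: ["ac", "cd", "de", "ef"],
    5: ["ab", "bd", "df"],
}


def is_valid_path(path, source="a", destination="f"):
    """Check that consecutive circuits join up and run from source to destination."""
    if not path or path[0][0] != source or path[-1][1] != destination:
        return False
    return all(prev[1] == nxt[0] for prev, nxt in zip(path, path[1:]))


class Reassembler:
    """Holds packets at the destination until the whole message has arrived."""

    def __init__(self, total_packets):
        self.total_packets = total_packets
        self.received = {}  # packet number -> payload

    def accept(self, number, payload):
        """Store one packet; return the ordered message once it is complete."""
        self.received[number] = payload
        if len(self.received) < self.total_packets:
            return None
        return [self.received[n] for n in range(1, self.total_packets + 1)]

    def missing(self):
        """Packet numbers the source would be asked to retransmit on timeout."""
        return [n for n in range(1, self.total_packets + 1) if n not in self.received]
```

Packets may be handed to `accept` in any order; `missing` lists the packet numbers that node A would be asked to resend once the time interval expires.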

2.3  Topological  Classifications 

Network  topology,  as  a means  for  categorizing  data  communications 
networks,  evolved  from  graph  theory.  A detailed  introduction  to  graph 
theory  and  its  relationship  to  computer  science  is  provided  by  Deo  (22). 
Topology  refers  to  properties  of  a network  which  are  independent  of  its 
size  and  shape,  such  as  the  connection  pattern  of  links  and  nodes  (21). 
Graph is the mathematical term for network, i.e., a collection of points
joined by links. Terms borrowed from graph theory include node, which is
analogous to the network computer switch; link, which provides a connection
between  any  two  nodes  and  is  analogous  to  the  communication  channel;  and 
path,  which  represents  the  physical  media  for  communications  of  intelli- 
gence across  the  network.  This  terminology  forms  the  basis  for  extensive 
research  into  the  various  topological  classifications  of  data  networks, 
i.e.,  centralized,  decentralized,  and  distributed  (Figure  4).  The 
following  paragraphs  provide  the  distinguishing  characteristics  of  each 
of  these  classifications. 

2.3.1  Centralized  Networks 

The  centralized  network  is  essentially  a star  topology,  and  is  the 
simplest  of  arrangements  where  switching  has  been  introduced  into  the 
network  (38).  This  topological  scheme  requires  that  a link  be  dedicated 
for  communications  between  the  central  node  and  each  terminal  during  any 
period  of  operation  for  a terminal. 

The  reliability  of  the  centralized  network  is  highly  dependent  upon 
the  central  switch  (node).  Its  failure  suspends  all  activity  in  the 
network  whereas  individual  link  failures  affect  only  a single  device  per 
link.  Any  significant  increase  in  reliability  requires  duplication  of  the 
switching  function,  a relatively  high  expense  endeavor. 

Geographically  dispersed  terminals  frequently  lead  to  the  use  of 
concentrators  or  multiplexors.  Indeed,  it  is  very  unlikely  that  all 
terminals  would  be  simultaneously  active  in  a system  of  terminals  joined 
to  a central  computer.  A hardware  switch  may  be  justified  where  a sub- 
set of  the  terminals  remotely  connected  to  a central  computer  are  located 
in  the  same  geographical  vicinity  (Figure  5).  The  objective  is  to  obtain 
more efficient link utilization at the expense of an occasional delay in
turn-around  time.  The  switch  is  referred  to  as  a multiplexor  where  the 
information  transfer  rate  required  never  exceeds  that  formally  demanded 
by  one  link  (21).  Where  the  potential  input  capacity  exceeds  link 
capacity,  the  switch  is  called  a concentrator.  The  latter  of  these 
requires  some  storage  capacity  to  compensate  for  occasions  where  instan- 
taneous input  rates  exceed  link  capacity.  Concentrators  are  also  used  to 
merge  several  low  speed  links  into  one  high  speed  link.  The  switching 
power  of  concentrators  and  multiplexors  is  totally  dependent  upon  an 
operational  central  node. 

2.3.2  Decentralized  Networks 

The  distinction  between  centralized  and  decentralized  networks  lies 
in  the  organization  of  the  switching  function  (Figure  4).  Decentralized 
networks  may  be  viewed  as  an  expanded  centralized  network  possessing 
multiplexor/concentrators  whose  switching  power  is,  to  some  extent, 
independent  of  that  of  any  other  node.  Graph  theorists  would  refer  to 
such  a network  as  a mixture  of  star  and  "mesh"  components,  where  a mesh  is 
a completely  enclosed  region  (22)  (note  that  Figure  4b  contains  a mesh 
component).

The  added  reliability  of  decentralized  switching  power  comes  at  the 
expense  of  additional  computers  (nodes)  and  corresponding  connecting 
links.  This  added  reliability  generally  comes  in  some  (though  not  elabor- 
ate) form  of  alternate  routing,  that  is  to  say,  not  every  path  will  be 
duplicated  (21).  Within  reason,  the  reliability  of  networks  can  be  made 
as  high  as  desired,  however,  by  duplicating  paths  with  the  addition  of 
corresponding  links  and  nodes.  The  existence  of  at  least  two  disjoint 
paths  between  every  pair  of  nodes  describes  what  Baran  (6)  refers  to  as 
the distributed network (Figure 4c), discussed in some detail in the
following  paragraphs.  It  should  be  noted  at  this  point  that  recent 
papers  (32,35,50,59)  have  dropped  any  reference  to  the  decentralized 
network,  combining  its  description  with  that  of  distributed  networks. 

2.3.3  Distributed  Networks 

Much has been written about distributed networks since networking of
computers was first envisioned in the early 1950's, starting with telephone
systems. Practically every paper published since that time about computer
networks has made some reference to the distributed class of networks.
The evolution of this intriguing and complex maze of computers has led
to consideration of networks with thousands and even tens of thousands
of nodes (39,50). Therein lies much of the ongoing research on networks
today.

The  distributed  network,  as  originally  envisioned,  consists  of  a set 
of  mesh  subnetworks  where  each  node  is  connected  to  a minimum  of  three 
other  nodes.  Each  node  possesses  the  capability  to  systematically  switch 
between  connected  links  according  to  some  prescribed  routing  algorithm 
(several  of  which  are  described  in  section  2.4)  which  maximizes  the  capa- 
city of  a particular  network.  Distributed  networks  evolved  from  attempts 
to  define  military  communications  systems  suitable  for  operating  in  hostile 
environments  (6,9).  Because  of  the  inherent  reliability  for  continuity 
of  operation,  the  distributed  network,  using  packet  switching  techniques, 
is  generally  considered  to  have  the  greatest  potential  for  networks  of 
the  future.  Much  of  the  ensuing  discussion  in  the  following  paragraphs  is 
attributed  to  recent  work  by  Boorstyn  and  Frank  presented  at  the  First 
Joint IEEE-USSR Workshop on Information Theory, Moscow, USSR, December
1975 (11).

Performance of distributed data networks is characterized by parameters
of cost, throughput, response time, and reliability. The design
of the network should consider properties of its nodes along with the
network's topological structure. Important considerations are (29):

a)  Node  Characteristics 

1)  Message  handling  and  buffering 

2)  Error  control 

3)  Flow  control 

4)  Routing 

5)  Node  throughput 

6)  Node  reliability 

b)  Topological  Characteristics 

1)  Link  location 

2)  Link  capacity 

3)  Network  response  time 

4)  Network  throughput 

5)  Network  reliability 

For  small  or  medium  sized  networks  (up  to  20  or  30  nodes),  the  typi- 
cal structure  is  fairly  homogeneous  with  identical  hardware  and  software 
at  each  node.  Because  of  high  packet  overhead  and  storage  for  delay 
vectors  in  using  global  routing  procedures,  larger  networks  require  the 
use  of  alternatives  such  as  topologies  with  embedded  hierarchies.  For  a 
two-level  hierarchy,  the  highest  level  can  be  thought  of  as  the  "backbone" 
network  and  the  lower  hierarchy  as  a set  of  subnets  which  access  the  back- 
bone network  through  one  or  more  high-level  nodes  that  act  as  gateways. 
Boorstyn et al. have classified four general problems which must be considered
for multilevel networks:

a) Preliminary clustering of user locations,

b)  Selection  of  nodal  processor  locations, 

c)  Backbone  topological  design  for  the  upper  levels, 

d)  Local  access  design. 

It  is  item  c,  backbone  topological  design,  which  currently  poses  the 
greatest  challenge  for  designers  of  large-scale  distributed  networks  and 
which  is  currently  receiving  the  most  attention  in  data  network  research. 

2.4  Routing  Classifications 

Extensive research has been done on the design and modeling of network
routing algorithms by Prosser (69,70), Boehm and Mobley (9), Doll (23),
Gerla (35), Metcalfe (62), McQuillan (59), Kleinrock (50), and many others.
The  descriptions  contained  herein  are  attributed  primarily  to  the  doctoral 
work  of  Fultz  (32)  who  has  consolidated  an  excellent  taxonomy  for  classi- 
fying network  routing  algorithms  (Table  1).  As  will  be  noted,  there  is 
considerable  overlap  between  classifying  networks  according  to  the  type 
of  routing  algorithms  used  and  according  to  the  nature  of  their  topology. 

Though  not  specifically  identified,  several  of  the  algorithms 
discussed  in  this  section  are  extensions  of  routing  mechanisms  developed 
for  voice  (circuit)  switching  networks  as  they  evolved  in  the  1950's  (6). 
An  extensive  literature  review  indicates  that  since  that  time  little  has 
been  done  to  improve  on  circuit  switching  algorithms  for  data  communica- 
tions purposes.  Therefore,  the  ensuing  discussions  are  devoted  to  a 
review  of  routing  schemes  for  packet/message  switching  networks. 

2.4.1  Deterministic  Algorithms 


Deterministic routing algorithms derive routes according to some
given deterministic decision rule which produces loop-free routing, i.e.,
messages (packets) can never become trapped in closed paths. This type
of routing policy consists of four basic techniques, some of which are
subdivided into more descriptive categories.

Table I. Classification of routing algorithms (32)

1. Deterministic
     Flooding
     Fixed
     Split Traffic
     Ideal Observer

2. Stochastic
     Random
     Isolated
       Local Delay Estimate
       Shortest Queue + Bias
     Distributed
       Periodic Update
       Asynchronous Update

3. Flow Control
     Isarithmic
     Buffer Storage Allocation
     Special Route Assignment

2.4.1.1 Flooding Techniques

Flooding appears to be the simplest of all algorithms (9,21,45).
It requires neither large storage for maintaining current routing information
nor statistical data for building sophisticated delay measuring
mechanisms,  etc.,  required  by  stochastic  and  flow  control  techniques.  A 
node  must  only  remember  the  link  a message  arrived  over.  It  then  retrans- 
mits the  message  over  all  connected  links  except  the  link  from  which  the 
message  was  received.  After  having  circulated  through  the  network  for  a 
prespecified  period,  a message  is  then  returned  to  the  originator  as 
confirmation  that  the  flooding  cycle  has  been  completed  for  that  message. 
In  terms  of  minimum  delay,  flooding  always  produces  an  optimum  routing 
policy  as  the  network  is  placed  in  operation.  After  an  initial  stabil- 
ization period,  it  quickly  succumbs  to  congestion  because  of  the  excess 
traffic  in  the  network.  As  well  as  having  been  proposed  for  various 
military  applications  (23),  it  has  been  suggested  that  flooding  might  be 
used  as  an  initial  "path  finder"  to  derive  routing  and  path  delay  statis- 
tics required  for  other  techniques  (21).  Efficiency  considerations,  in 
general,  though,  rule  out  the  use  of  flooding  techniques  as  a day-to-day 
policy  for  network  routing  (9). 
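
The flooding rule can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the original description: the "prespecified period" is modeled as a hop limit (`max_hops`), the return of the message to the originator is omitted, the function name `flood` is hypothetical, and the topology `LINKS` is simply the set of circuits that appear in the path descriptors of section 2.2.3.

```python
from collections import deque

# Assumed six-node topology: node -> neighbors reachable over one link.
LINKS = {
    "a": ["b", "c"],
    "b": ["a", "d", "e"],
    "c": ["a", "d"],
    "d": ["b", "c", "e", "f"],
    "e": ["b", "d", "f"],
    "f": ["d", "e"],
}


def flood(links, source, max_hops):
    """Flood one message from `source`.

    Returns (first_arrival, copies_sent), where first_arrival maps each
    reached node to the hop count of the first copy it received.
    """
    first_arrival = {source: 0}
    copies_sent = 0
    # Each entry: (node holding a copy, neighbor it arrived from, hops so far).
    queue = deque([(source, None, 0)])
    while queue:
        node, arrived_from, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop budget exhausted; this copy is dropped
        for neighbor in links[node]:
            if neighbor == arrived_from:
                continue  # never retransmit over the arrival link
            copies_sent += 1
            first_arrival.setdefault(neighbor, hops + 1)
            queue.append((neighbor, node, hops + 1))
    return first_arrival, copies_sent
```

With a hop limit of 3 the first copy reaches F over a minimum-hop path, yet thirteen copies are transmitted to deliver a single message, which illustrates both the optimal-delay property and the congestion cost noted above.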

2.4.1.2 Fixed Techniques

Fixed routing techniques assume fixed topologies and known traffic
patterns. Optimal route selection is then reduced to a multi-commodity
flow problem with well-defined solution techniques (18,32). Appropriate
routing is generally obtained via a routing directory lookup procedure
which is fixed for a network configuration (70). A routing directory
contains  the  link  address  for  communicating  a packet  from  a node  to  any 
location  in  the  network.  By  searching  the  directory  for  a given  destin- 
ation, a cross-reference  is  obtained  to  the  corresponding  link  for 
transmitting  the  packet.  Because  of  their  inflexibility,  fixed  routing 
techniques  do  not  perform  well  in  hostile  environments  such  as  exper- 
ienced in  wartime  conditions.  However,  use  of  alternate,  fixed  routes 
provides  a reasonable  degree  of  survivability  in  the  event  of  hostilities. 
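
A routing directory with a fixed alternate route can be sketched as a simple lookup table. The directory contents, the name `DIRECTORY_AT_A`, and the function `select_link` below are hypothetical and chosen only for illustration; they assume a node A with two outgoing links, ab and ac.

```python
# Hypothetical directory held at node A:
# destination -> (primary outgoing link, fixed alternate link).
DIRECTORY_AT_A = {
    "b": ("ab", "ac"),
    "c": ("ac", "ab"),
    "d": ("ab", "ac"),
    "e": ("ab", "ac"),
    "f": ("ab", "ac"),
}


def select_link(directory, destination, failed_links=frozenset()):
    """Return the outgoing link for `destination`.

    The primary link is used unless it has failed, in which case the
    fixed alternate is tried. None means no fixed route survives.
    """
    primary, alternate = directory[destination]
    if primary not in failed_links:
        return primary
    if alternate not in failed_links:
        return alternate
    return None
```

The lookup never consults traffic load; the alternate is used only on failure, which is the inflexibility described above.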

2.4.1.3 Split Traffic Techniques

Split traffic routing, sometimes referred to as traffic bifurcation,
allows traffic to flow on more than one path between a given source and
destination. Using the example provided by Fultz (32), assume that two
different paths, $\pi_1(S,D)$ and $\pi_2(S,D)$, exist. Then a packet at node S
would be routed over $\pi_1(S,D)$ with probability P and routed over $\pi_2(S,D)$
with probability 1 - P. When compared to fixed routing, split traffic
techniques maintain a better balance of traffic throughout the network,
thus achieving smaller average message delays.
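
The probabilistic split can be sketched as follows. The function names `choose_path` and `observed_split` are assumptions made for illustration, as is the use of a seeded pseudo-random generator to make the result repeatable.

```python
import random


def choose_path(paths, p, rng=random):
    """Pick paths[0] with probability p, otherwise paths[1]."""
    return paths[0] if rng.random() < p else paths[1]


def observed_split(paths, p, trials, seed=0):
    """Fraction of `trials` packets that were routed over paths[0]."""
    rng = random.Random(seed)
    first = sum(choose_path(paths, p, rng) == paths[0] for _ in range(trials))
    return first / trials
```

Over many packets the observed fraction on the first path approaches P, so the choice of P directly sets the long-run division of traffic between the two paths.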

2.4.1.4 Ideal Observer Techniques

The  ideal  observer  routing  technique  requires  total  knowledge  of  the 
system  at  each  instant  of  time.  "Each  time  a new  packet  enters  a node 
from  its  HOST,  its  route  is  computed  to  minimize  the  travel  time  to  its 
destination  node,  based  upon  the  complete  present  information  about  the 
packets  already  in  the  network  and  their  known  routes"  (32).  Though 
inherent  network  delays  make  the  ideal  observer  technique  of  only 
theoretical  interest,  the  practical  value  of  "total  knowledge"  may  be 
useful  in  providing  an  upper  bound  on  network  performance. 

2.4.2 Stochastic Algorithms

Stochastic  algorithms  operate  as  probabilistic  decision  rules  in 
that  routes  are  selected  utilizing  network  topology,  perhaps  combined  with 
estimates  concerning  the  state  of  the  network.  These  estimates  are  based 
upon  statistically  derived  delay  information  communicated  between  inter- 
connected nodes.  The  delay  information  is  generally  stored  in  a matrix, 
called  a routing  table,  maintained  at  each  node.  A routing  table  is  used 
in  much  the  same  manner  as  the  directory  is  for  split  traffic  techniques 
since  it  contains  delay  information  to  all  other  nodes  in  the  network 
over  each  outgoing  link.  Though  dynamic  in  nature,  the  frequency  of 
table  updates  is  a function  of  many  complex  and  interrelated  considerations 
to  minimize  average  network  traversal  times.  A discussion  of  these  con- 
siderations is  continued  in  Chapter  III.  The  following  paragraphs 
describe  three  categories  of  stochastic  techniques  which  have  been 
identified.

2.4.2.1 Random Techniques

Random  routing  algorithms  assume  that  each  node  knows  only  its  own 
identity.  Each  message  is  sent  forward  on  a link  chosen  at  random,  hence, 
eventually  arriving  at  the  destination  in  what  Davies  and  Barber  (21) 
refer  to  as  a "drunkard's  walk".  They  proposed  the  introduction  of  a 
bias  to  guide  the  message  roughly  in  the  right  direction,  but  leaving  a 
substantial  random  element  to  cope  with  possible  link  or  node  failures. 
Although  algorithms  using  pure  random  routing  are  inefficient,  they 
possess  a surprising  amount  of  stability  for  networks  having  high 
probabilities  of  link  or  node  failure  (9,45,69). 

2.4.2.2 Isolated Techniques

The isolated routing technique, using local delay estimates, assumes
that traffic loads are approximately equivalent in both directions between
any  given  source/destination  pair.  This  technique  is  also  referred  to  as 
"backwards  learning"  since  the  delay  estimate  to  the  source  node  is  based 
upon  combinations  of  transit  times  of  messages  received  from  the  source 
node  over  the  selected  route.  A routing  table  is  formed  at  each  node 
containing  the  most  current  estimated  delay  to  each  source.  When  a 
message  is  to  be  transmitted  to  a particular  node,  the  routing  table  is 
scanned  for  the  minimum  delay  path  (9). 
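
A minimal sketch of backwards learning follows. The text does not specify how transit times are combined, so the exponential smoothing used here, its `smoothing` weight, and the class name `BackwardLearningTable` are all assumptions made for illustration.

```python
class BackwardLearningTable:
    """Estimates delay to each node from messages that node has sent to us."""

    def __init__(self, smoothing=0.5):
        self.smoothing = smoothing  # weight of the newest observation (assumed)
        self.estimates = {}         # source -> {arrival link: estimated delay}

    def observe(self, source, arrival_link, transit_time):
        """Fold the transit time of a received message into the estimate."""
        links = self.estimates.setdefault(source, {})
        if arrival_link in links:
            old = links[arrival_link]
            links[arrival_link] = (1 - self.smoothing) * old + self.smoothing * transit_time
        else:
            links[arrival_link] = transit_time

    def best_link(self, destination):
        """Scan the table for the minimum-delay link toward `destination`."""
        links = self.estimates.get(destination)
        if not links:
            return None  # nothing has been heard from this node yet
        return min(links, key=links.get)
```

The estimate toward a node is only as good as the assumption that traffic in the reverse direction sees similar delay, which is the symmetry condition stated above.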

The  second  of  the  isolated  techniques,  called  the  isolated  shortest 
queue  procedure,  stems  from  early  research  by  Baran  (10)  to  develop  an 
adaptive  routing  method  with  some  degree  of  survivability  in  the  event  of 
hostilities.  His  procedure,  also  referred  to  as  the  "hot  potato"  method, 
requires  that  intermediate  nodes  retransmit  a packet  as  quickly  as 
possible  after  being  received.  Each  node  in  the  network  contains  a ranked 
list  of  lines  leading  to  neighboring  nodes  for  every  destination.  Packets 
are  directed  to  the  highest  rank  free  line  for  a given  destination  or, 
should  all  lines  be  in  use,  to  the  line  containing  the  shortest  queue  (70). 
Though  initially  developed  as  an  adaptive  routing  alternative  for  military 
voice  communications  networks,  Baran's  "hot  potato"  method  is  credited 
with  stimulating  the  research  and  development  of  ongoing  packet  switching 
concepts  (21). 
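
The line selection rule of the "hot potato" method can be sketched as below. The function name `hot_potato_link` and the representation of line state as queue lengths (with zero meaning a free line) are assumptions made for illustration.

```python
def hot_potato_link(ranked_links, queue_lengths):
    """Pick the highest-ranked free link, else the link with the shortest queue.

    ranked_links: outgoing links for one destination, best first.
    queue_lengths: link -> number of packets waiting (0 means the link is free).
    """
    for link in ranked_links:
        if queue_lengths[link] == 0:
            return link
    # All lines are in use: min() keeps the better-ranked link on ties.
    return min(ranked_links, key=lambda link: queue_lengths[link])
```

For example, at a node D whose ranked list for destination F is df, de, db, a packet goes out on df whenever df is free and otherwise on the least loaded line.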

2.4.2.3 Distributed Techniques

The  distributed  class  of  algorithms  relies  upon  nodes  exchanging 
observed  delay  information  with  each  other.  This  approach  introduces  an 
inordinate amount of measurement information into the network and is
therefore impractical for large networks. Modified procedures have been
proposed  by  Fultz  (33)  using  a "minimum  delay  vector"  and  McQuillan  (59) 
who  advocates  the  "area  approach"  method. 

Fultz  has  each  node  exchange  a minimum  delay  vector  to  each  of  its 
nearest  neighbors.  They  in  turn  update  the  vector  with  their  own  internal 
delays  and  pass  it  to  each  of  their  nearest  neighbors.  The  permutation  of 
transmitting  updated  delay  vectors  between  nodes  eventually  provides  each 
node  with  a matrix  listing  delays  to  all  possible  destinations  via  each 
outgoing circuit. Repeated updates may be prompted on either a periodic or
an aperiodic basis.
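
The exchange of minimum delay vectors can be sketched as a synchronous, round-based computation. This is not Fultz's implementation: the function names, the fixed number of rounds, and the link delays in `DELAYS` are assumptions chosen only to show how each node arrives at a matrix of delays per destination and per outgoing circuit.

```python
INF = float("inf")

# Assumed link delays for the six-node example network (arbitrary units).
DELAYS = {"ab": 1, "ac": 4, "bd": 2, "be": 5, "cd": 1, "de": 1, "df": 3, "ef": 1}


def symmetric(delays):
    """Expand {"ab": 1, ...} into node -> {neighbor: delay} in both directions."""
    out = {}
    for (u, v), delay in delays.items():
        out.setdefault(u, {})[v] = delay
        out.setdefault(v, {})[u] = delay
    return out


def exchange_delay_vectors(link_delay, rounds):
    """Repeat neighbor-to-neighbor delay vector exchanges for `rounds` rounds.

    Returns table[node][destination][neighbor] = estimated delay to the
    destination when leaving `node` over the link to `neighbor`.
    """
    nodes = list(link_delay)
    # Each node starts out knowing only that its delay to itself is zero.
    best = {n: {d: (0 if d == n else INF) for d in nodes} for n in nodes}
    table = {n: {d: {} for d in nodes if d != n} for n in nodes}
    for _ in range(rounds):
        sent = {n: dict(best[n]) for n in nodes}  # vectors as transmitted this round
        for node in nodes:
            for neighbor, delay in link_delay[node].items():
                for dest in table[node]:
                    # Neighbor's reported minimum plus the delay of the link to it.
                    table[node][dest][neighbor] = delay + sent[neighbor][dest]
            for dest in table[node]:
                best[node][dest] = min(table[node][dest].values())
    return table
```

After enough rounds the entry for destination F at node A lists one delay per outgoing circuit, and the smaller of the two identifies the preferred link.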

McQuillan,  on  the  other  hand,  partitions  the  network  into  disjoint 
areas in which a particular node exchanges information with every node
within its area. Information is exchanged with adjacent areas as though
each were a single node. Kleinrock et al. (59) propose an extension of
the  partitioning  scheme  with  the  use  of  m-level  hierarchical  routing 
clusters.  The  objective  of  such  a scheme  is  to  reduce  the  amount  of 
routing  information  that  must  be  retained  at  each  node  for  determining 
which  link  to  transmit  a packet  on.  It  is  anticipated  that  some  form  of 
the  area  approach  will  provide  a logical  solution  to  many  control  problems 
which  currently  exist  for  large  scale  networks. 

2.4.3  Flow  Control  Algorithms 

Congestion  may  occur  within  a network  on  either  a global  or  a local 
basis.  This  congestion  can  be  relieved  by  the  use  of  a hierarchy  of 
protocols which allow the selection of alternative actions based upon
information  contained  in  message  and  packet  control  fields.  The  hierarchy 
of  protocols  consists  of: 


1) Host-to-host (message protocols),

2) Source node-to-destination node (packet protocols),

3) Node-to-node (link protocols).

The latter of these attempts to alleviate congestion from node to node
and is therefore local in nature. The other two methods are global in
nature since they attempt to control traffic flow in the network. They
are frequently referred to as end-to-end methods. Though others undoubtedly
exist, three routing schemes are described in the following paragraphs
which possess one or more of these congestion control protocols.

2.4.3.1 Isarithmic Techniques

An isarithmic network is one in which the total number of packets is
held constant (20). This is accomplished by replacing data-carrying
packets with dummy packets. As data is prepared for the communications
process, it must capture a dummy packet in order to enter the network.
Such a threshold was believed to inhibit congestion since it had been
noticed that the level of traffic within a network can be expressed in
terms of the total number of packets.
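
The constant-packet rule can be sketched with a simple counter model. The class name `IsarithmicNetwork` and its methods are hypothetical; the sketch tracks only packet counts and ignores where in the network the dummy packets circulate.

```python
class IsarithmicNetwork:
    """Keeps the total packet count constant by trading dummy packets for data."""

    def __init__(self, total_packets):
        self.dummy_packets = total_packets  # all packets start out as dummies
        self.data_packets = 0

    def admit(self):
        """Capture a dummy packet for new data; refuse entry if none is available."""
        if self.dummy_packets == 0:
            return False
        self.dummy_packets -= 1
        self.data_packets += 1
        return True

    def deliver(self):
        """On delivery the data packet is replaced by a fresh dummy packet."""
        if self.data_packets == 0:
            raise ValueError("no data packet in the network")
        self.data_packets -= 1
        self.dummy_packets += 1

    def total(self):
        """Total packets in the network; this never changes."""
        return self.dummy_packets + self.data_packets
```

New data is refused once every dummy packet has been captured, which is how the scheme caps the level of traffic.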

2.4.3.2 Buffer Storage Allocation Technique

Buffer storage allocation refers to a procedure developed by Kahn and
Crowther (44) in which the source node requests allocation of message
reassembly space from the destination node prior to release of the message
for transmission. This procedure was proposed as a solution to prevent
message reassembly lockup, which occurs when nodes adjacent to the destination
node become filled due to packets being rejected by the destination
node. Kahn first proposed that the destination node discard packets which
cannot be accepted and then notify the source node of the action. The
main problem with this approach is that it produces unnecessary duplicate
packet transmissions in order to communicate a message. Advance allocation
of reassembly buffers, resulting in occasional transmission delays,
was concluded to be more efficient than recovering from discarded packets.
Though easy to implement, neither procedure is considered adequate for
the real-time or data-sharing users.

2.4.3.3 Special Route Assignment Techniques

Kahn  and  Crowther  (44)  proposed  an  alternative  to  the  buffer  storage 
allocation technique in the form of assigning special routes. The
assignment of these routes by a node is based upon: 1) status information
received  from  adjacent  nodes  and  2)  traffic  patterns  encountered  by  the 
node  over  the  past  several  seconds.  Thresholds  are  established  to  prevent 
the  use  of  alternate  routes  in  response  to  rapid  changes  in  traffic  flow. 

This  is  accomplished  by  combining  measurements  on  the  rate  of  change  of 
traffic  on  each  path  with  a predefined  interval  of  time  before  alternate 
routing  can  be  established.  As  a result,  changes  to  primary  routing  may 
occur  only  in  response  to  a sustained  demand  for  higher  throughput. 

Table II provides specific properties of the flow control routing algorithm.

2.5  Discussion 

In  1975,  Rudin  (73)  proposed  a classification  scheme  which  combines 
routing  techniques  with  topological  categories.  He  states  that  all 
routing algorithms fall naturally into one of the following
classifications, all but one (centralized, type 2) of which has
previously been described in one form or another:

Table II. Properties of special route assignment technique (44)


1.  The  routing  selection  is  performed  independently  by  each  node, 
based  upon  information  received  from  adjacent  nodes  and  traffic 
patterns  encountered. 

2.  The  algorithm  attempts  to  guarantee  that  individual  routing 
decisions  possess  global  continuity  for  the  network. 

3.  Interval  (synchronous)  updating  of  routing  tables  always  occurs 
and  dynamic  (asynchronous)  updating  will  occur  where  justified. 

4.  In  selecting  routes,  the  network  is  decomposed  into  a union  of 
identical  and  overlapping  subnetworks  with  separate  routing 
computed  for  each  subnetwork. 

5.  For  an  unloaded  net,  links  are  selected  which  result  in  transiting 
the  fewest  nodes  to  the  destination. 

6.  For  a loaded  net,  traffic  is  diverted  from  fully  occupied  links 
whenever  possible. 

7.  Changes  in  routing  will  occur  only  due  to  the  sustained  flow  of 
traffic  according  to  a new  traffic  pattern. 

8.  Additional  paths  will  be  established  for  a given  destination 
which  will  allow  individual  packets  to  depart  on  separate  links. 

9.  Traffic  flow  on  any  link  in  a subnetwork  may  occur  in  only  one  of 
the  two  directions  at  a time  (half-duplex  operation). 

10.  Directions  of  flow  may  change  infrequently  only  after  passing 
through  a neutral  state  for  a short  interval  of  time. 

11.  The  maximum  allowed  traffic  through  each  node  in  a subnetwork  is 
regulated  so  as  to  change  slowly.  This  provides  stability  and 
allows  adjustments  for  increased  traffic  flow. 

12.  For  each  subnetwork,  loops  in  routing  are  quickly  detected  and 
broken. 


A.  Centralized  Techniques 

1.  Type  1 - fixed

2.  Type  2 - network  routing  control  center 

3.  Type  3 - ideal  observer 

B.  Distributed  Techniques 

1.  Type  1 - co-operative,  periodic  or  asynchronous  routing
table  updates 

2.  Type  2 - isolated,  local  delay  estimates 

3.  Type  3 - isolated,  shortest  queue 

4.  Type  4 - random 

5.  Type  5 - flooding. 

Type 2 centralized techniques use a network routing center (NRC)
which periodically accepts update traffic load information from each
network node. The NRC uses this information to regenerate routing tables
which then remain fixed until the next update. The primary disadvantages
of this technique are: 1) the single network switchpoint (NRC) where the
routing strategy can change between any two packets and 2) the
constrained system behavior due to single paths for each
source-destination node pair. As a result, Rudin (74) concludes that the
use of the NRC approach causes instability in well-balanced networks.

Centralized  routing  techniques  have  been  extensively  implemented  and 
perform  well  within  their  natural  constraints.  The  high  propensity 
towards  total  system  collapse  due  to  failure  of  the  central  control 
facility  is  an  example  of  a natural  constraint  of  centralized  control 
networks.  Another  is  the  inherent  inflexibility  for  adjustments  to  load 
variations. In general, experience has shown that they incur less
delay than distributed routing algorithms under stable traffic flows
(32,35,70). The properties of a highly centralized network are well
known and, thus, raise few radically new problems beyond the existing
technology of computer-communications systems (43,59,74).

In evaluating the relative strengths and weaknesses of centralized
and  distributed  routing  techniques,  Rudin  (73)  proposed  a hybrid  procedure 
referred  to  as  "delta  routing".  In  this  scheme,  topological  decisions  are 
divided  into  two  categories.  Decisions  having  only  local  network  impact 
are  implemented  at  the  node  level  while  decisions  having  global  impact  are 
implemented  via  the  network  routing  control  center.  Though  it  appears 
to  take  advantage  of  the  more  favorable  aspects  of  both  classes,  delta 
routing  still  suffers  from  the  weaknesses  introduced  by  the  requirement 
for  a central  control  facility. 

Static  (deterministic)  routing  strategies,  exemplified  by  type  1 of 
the  centralized  algorithms,  provide  optimal  routing  but  are  based  on  the 
assumption  of  total  reliability  and  fixed  load  patterns.  In  general 
these  assumptions  make  the  static  scheme  unusable  except  for  analytical 
purposes.  The  obvious  solution  is  an  adaptive  routing  policy  (exemplified 
by types 1, 2, and 3 of the distributed routing algorithms) in which
changes  in  routing  decisions  are  based  on  periodically  updated  information 
about  the  best  routes  to  each  destination  (7).  Hence,  adaptive  routing 
strategies,  which  take  advantage  of  a knowledge  of  the  current  system 
state,  have  generally  been  used  in  networks  with  non-homogeneous  hosts 
or  large  aggregates  of  nodes  and  links. 


Distributed routing algorithms suffer from two major shortcomings:
1) looping and 2) a tendency to route all messages to a given destination
through a single neighboring node. Looping is characterized by a message
which repeatedly traverses the same set of nodes. It may occur as a
result  of  deficient  interprocess  communication  (61)  or  unfortunate  timing 
(7).  Though  loopless  paths  for  centralized  routing  have  been  in  existence 
for  some  time  (15)  only  recently  have  solutions  to  loopless  distributed 
routing  been  proposed  (65). 

Traffic  flow  measurements  such  as  priority  pricing  (55)  and  commodity 
flow  have  become  useful  tools  to  provide  optimality  for  a routing  policy 
(2,30,48).  These  techniques,  adapted  from  the  management  sciences,  can 
be  used  to  analyze,  contrast,  and  improve  on  network  routing  algorithms. 
Simulation  has  also  been  employed  to  derive  traffic  patterns  under  assumed 
distributions.  Simulation  statistics  have  been  compared  with  results  of 
traffic  flow  measurements  to  determine  validity  of  efficiency  predictions 
for  various  routing  algorithms  (27,47). 

The  distributed  class  of  routing  algorithms  using  co-operative, 
periodic  or  asynchronous  updates  (type  1),  possibly  with  some  bias  term, 
appears  to  provide  substantial  advantages  over  other  current  network 
routing  procedures.  It  is  the  use  of  type  1 schemes  for  large  scale 
networks which offers one of the greatest challenges to the data
communications industry.

CHAPTER  III

THE  USE  OF  QUEUING  THEORY 
TO  MODEL  NETWORKS 

3.  Introduction 

Analytic  methods  involve  schemes  for  load  leveling  and  algorithms  to 
control  system  saturation  (19).  Thus,  networks  must  be  analyzed  in  their 
entirety  instead  of  being  considered  as  a collection  of  disjoint  message 
switches  (nodes).  Load  leveling  involves  assessment  of  constantly 
changing  loads  and  deciding  upon  an  appropriate  path  for  minimizing 
communications  cost,  both  to  the  user  and  to  the  network  operator. 
Similarly,  when  a message  enters  a network,  the  destination  node  must  be 
determined  so  that  goals  such  as  economy  and  minimum  delivery  times  may 
be  met.  However,  a more  important  problem  is  the  task  of  defining  a 
general network model representative of the many intricacies of the
network communications process.

The simplest network consists of a single node. The flow through
the node is determined by the arrival pattern of the messages which must
pass through the node and the service pattern of the node. Consider the
classical example of the one-man barbershop which does not make
appointments and services customers on a first-come-first-served (FIFO)
basis (37). If customers arrive at a uniform rate, $\lambda$, and the
barber can cut hair at a uniform rate, $\mu$, then $\lambda \le \mu$
means that the barber will be able to service all of his customers. The
remaining case, $\lambda > \mu$, implies that customers arrive faster
than the barber can cut hair and a waiting line or queue will form.
Subsequently, the queue will grow without bound until the barber closes
shop.

In contrast to uniformly distributed arrivals, assume that arrivals
are Poisson distributed with mean rate $\lambda$. Similarly, assume that
service is exponentially distributed with mean rate $\mu$. Just as with
the uniform distributions, it can be stated that for $\lambda > \mu$ the
waiting line will grow without bound. Stated in network terminology, on
the average, users are generating messages for delivery faster than they
can be delivered by the network.

The basic approach to analysis involving queuing systems has been
by decomposition. In this approach, complex networks are reduced to a
set of single server problems. Upon completing single server analysis,
the results are synthesized to form a composite solution which reflects
the properties of the original network. Justification for the
decomposition-synthesis process has been provided by Jackson (41) who
showed that if the Markovian assumptions of independent, exponentially
distributed arrival and service times hold, and if the system is stable,
each node may be treated as a separate single server queue. His research
was later enhanced by Burke (13) who demonstrated that systems having
exponentially distributed arrival and service times produce
exponentially distributed output traffic. Burke subsequently showed that
traffic flow between nodes is a Poisson process.

However, since messages maintain their length throughout the
communications process, discrepancies arise between the interarrival and
service times, thus violating constraints on the queuing model.
Kleinrock (45), though, proposed the "independence assumption" by
demonstrating that the interaction between nodes in a network having
sufficient connectivity causes the service and interarrival time
correlation to disappear. The combined research of Jackson, Burke, and
Kleinrock has established a viable basis for assuming Poisson arrivals
and exponential service for modeling networks.

3.1  Network  Characterization 

One  of  the  most  accepted  measures  of  network  performance  is  average 
message  delay  (45,46,47,60)  introduced  in  Chapter  II.  Average  message 
delay  has  also  proven  to  be  the  most  amenable  metric  for  mathematical 
analysis.  Analysis  based  upon  this  metric  is  frequently  referred  to  as 
the minimum cost routing problem (79). It provides the impetus for the
theoretical characterization of the network as described in the following
paragraphs.

Assume a distributed network defined by the following parameters:

a) N nodes and M links,

b) an N x N traffic matrix $R = [r_{ij}]$ where $r_{ij}$ represents the
traffic density between nodes i and j, and

c) a cost function, $f_{ij}(\gamma_{ij})$, relating the total cost of
sending a message from node i to node j to the combined traffic loads,
$\gamma_{ij}$, at node i and node j.

The objective of the routing problem is to accommodate the density
of the traffic matrix while minimizing the total network cost defined as

$$T = \sum_{x=1}^{M} f_x(\gamma_x)$$

where x implies there exists an arc from node i to node j. The notation
$\gamma_x$ is defined to be the sum of all point-to-point flows from
node i to node j.

All inputs to a node are first placed in a service queue prior to
being serviced by the node. If the link selected for transmission is
busy, messages are subsequently placed in a transmission queue awaiting
transmission. For practical reasons, the lengths of service and
transmission queues must be finite. Therefore, if the node is near
saturation, a portion of the arriving messages will be turned away for
lack of space. However, since steady-state solutions are desired, finite
queue assumptions are not restrictive. Each node has an exponential
service time with rate $\mu_j$ and operates in the full duplex mode. As
a result, messages are not required to wait for a free receiving node
and no blocked state is assumed.

The input stream to the jth node consists of three types: Type "a"
messages are those originating from users directly serviced by node j,
type "b" messages pass through node j en route to some destination node
$i \ne j$, and type "c" messages are destined for a user directly
serviced by j. Based upon Burke's conclusions described earlier, type
"a" messages are assumed to arrive at node j from a Poisson distribution
with an average arrival rate of $\gamma_j$. The arrival of type "b" and
"c" messages (not necessarily Poisson (19,49)) is denoted as
$\lambda_i$.

Assume for the moment that all messages have the same priority and
that node j is receiving input from locally attached network users and
from adjacently attached nodes. Then the true arrival rate for node j is
given by (49)

$$\lambda_j = \gamma_j + \sum_{\substack{i=1 \\ i \ne j}}^{n_j} \lambda_i p_{ij} \tag{3.1}$$

where $\gamma_j$ and $\lambda_i$ are as described above, $n_j$ is the
number of nodes adjacent to node j, and $p_{ij}$ is the probability that
a message is transmitted from node i to node j. Figure 6 illustrates the
queue structure for the jth node with a service rate of $\mu_j$ and
departure rate of $\lambda_j p_{ji}$ where the double subscript ji
identifies node j's type "b" and "c" message input rate to adjacent
node i.
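
The flow balance expressed by equation 3.1 can be evaluated numerically.
The following is a minimal sketch, not the procedure used by the
simulator: it assumes a hypothetical three-node line network in which
each row of the routing probabilities sums to less than one (the
remainder is taken to be delivered locally), so that a finite solution
exists. The function name `solve_traffic_equations` and the example
values are illustrative assumptions.

```python
def solve_traffic_equations(gamma, p, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration on lambda_j = gamma_j + sum_{i != j} lambda_i * p[i][j].

    gamma[j] : external (type "a") arrival rate at node j
    p[i][j]  : probability that a message leaving node i is sent to node j
    """
    n = len(gamma)
    lam = list(gamma)  # start from the external arrivals only
    for _ in range(max_iter):
        new = [
            gamma[j] + sum(lam[i] * p[i][j] for i in range(n) if i != j)
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("traffic equations did not converge")


# Hypothetical example: traffic enters at node 0 only; half of the
# traffic at each node is forwarded to the next node, the rest exits.
gamma = [1.0, 0.0, 0.0]
p = [
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
]
rates = solve_traffic_equations(gamma, p)
```

The iteration converges only when some traffic eventually leaves the
network; a routing matrix whose rows all sum to one with nonzero
external arrivals has no finite solution.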

Equation 3.1 implies the existence of a network probability matrix
of the form

$$P = [p_{ij}] = \begin{bmatrix} 0 & p_{12} & \cdots & p_{1n} \\ p_{21} & \ddots & & \vdots \\ \vdots & & \ddots & p_{N-1,n} \\ p_{N1} & \cdots & p_{N,n-1} & 0 \end{bmatrix}$$

where $1 \le i \le N$, $1 \le j \le n_i$, and
$\sum_{j=1}^{n_i} p_{ij} = 1 \ \forall i$. The value of $p_{ij} \in P$
is determined by the probability that a message is at node i (equal to
the product of the probability that a message exists in the network and
the probability that the message is being serviced by node i) combined
with the cost function, $f_{ij}(\gamma_{ij})$, of sending a message
through node i to node j en route to its destination. More precisely,
$p_{ij} = p_i(k_i) f_{ij}(\gamma_{ij})$ where $p_i(k_i)$ is equivalent to

$$\frac{p_i(K)}{\sum_{s=1}^{N} p_s(K)}$$

and K is the N-node network state vector, $(k_1, k_2, \ldots, k_N)$,
describing the number of messages in each node of the network (41). It
is the comparison of various cost functions for large scale networks
which receives emphasis in later chapters.


In summary, the following has been assumed:

1. Input Population: An infinite input population with a Poisson
arrival rate of $\gamma_j$ is assumed for messages arriving into the
network at node j. Also, an arrival rate of $\lambda_i$ is assumed for
messages arriving from nodes connected to j. The arrival rate for the
jth node is therefore described as

$$\lambda_j = \gamma_j + \sum_{\substack{i=1 \\ i \ne j}}^{n_j} \lambda_i p_{ij}$$

with $n_j$ equal to the number of adjacent nodes for node j.

2. Service Mechanism: Each node acts as a single server with
service rate $\mu_j$.

3. Queue Discipline: Entrance to each queue is FIFO. The length
of the service and transmission queues combined is necessarily finite.

4. Routing Procedure: This is derived from the network probability
matrix, P, where $p_{ij} \in P$ denotes the probability that a message
is sent from node i to node j.


3.2  Model  Description 

Jackson (41) defines a state variable for an N-node network as
$K = (k_1, k_2, \ldots, k_N)$ where $k_i$ is the number of customers in
the ith node. Let $p(k_1, k_2, \ldots, k_N)$ be the equilibrium
probability associated with state K. Jackson's theorem demonstrates that

$$p(k_1, k_2, \ldots, k_N) = p_1(k_1)\, p_2(k_2) \cdots p_N(k_N)$$

and that $p_i(k_i)$ is the solution to the M/M/C queuing system (using
classical Kendall notation). The system descriptors for the M/M/C system
are well known (37) and the solution to the M/M/C system is provided as
Appendix A. A visual interpretation of the M/M/C system is given in
Figure 7.

Figure 7. State-transition-rate diagram for M/M/C (49)

An alternative and more widely accepted approach is
decomposition-synthesis as modified by Kleinrock's independence
assumption. This technique allows network analysis using the basic
single server queuing model (M/M/1). First, assume Poisson arrivals with
a distribution of interarrival times A(t) and a distribution of service
times B(t). The expected system waiting time is derived as

$$W = \frac{\lambda \overline{t^2}}{2(1 - \rho)} + \bar{t} \tag{3.2}$$

where $\lambda$ is the average arrival rate, $\bar{t}$ and
$\overline{t^2}$ are the first and second moments of B(t) respectively,
and $\rho = \lambda \bar{t} < 1$ is the utilization factor. When B(t) is
also exponential, the equation reduces to

$$W = \frac{\bar{t}}{1 - \rho}. \tag{3.3}$$
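
The reduction from the general single-server result to the exponential
case can be checked numerically. The sketch below assumes the standard
form of the expected time in system, the mean service time plus
$\lambda \overline{t^2} / (2(1 - \rho))$, and uses the fact that an
exponential service distribution has second moment $2\bar{t}^2$. The
function names and the numerical values are illustrative only.

```python
def pk_system_time(lam, t1, t2):
    """Expected time in system for a single server with Poisson arrivals.

    lam : average arrival rate
    t1  : first moment (mean) of the service time
    t2  : second moment of the service time
    """
    rho = lam * t1
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return t1 + lam * t2 / (2 * (1 - rho))


def mm1_system_time(lam, t1):
    """Expected time in system when service is exponential (M/M/1)."""
    return t1 / (1 - lam * t1)


lam, t1 = 8.0, 0.1                     # utilization rho = 0.8
w_general = pk_system_time(lam, t1, 2 * t1 ** 2)  # exponential: t2 = 2 * t1^2
w_mm1 = mm1_system_time(lam, t1)
```

Both expressions evaluate to the same delay (0.5 seconds for these
values), which is the simplification stated for exponential service.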

Decompose the network into a series of single server problems which
reflect the original network structure and traffic flow. Let $Z_{ij}$
represent the expected delay for packets with origin node i and
destination node j. Also assume an average of $\gamma_{ij}$ packets per
second and that $1/\mu$ represents the mean of an exponential
distribution of packet lengths in bits per packet. With these
assumptions, $\gamma$ is defined to be the sum of the quantities
$\gamma_{ij}$ and

$$W = \sum_{i,j} \frac{\gamma_{ij}}{\gamma} Z_{ij}. \tag{3.4}$$

Equation 3.4 is now reformulated in terms of single channel delays.
Let $C_k$ represent the capacity of the kth channel in bits per second,
$\lambda_k$ represent the average packet traffic carried by channel k in
packets per second, and $W_k$ be the expected time that a packet will
wait for and use the channel. In relating the $\lambda_k$ to the
$\gamma_{ij}$ over the path selected by the routing algorithm, one can
see that

$$W = \sum_{k} \frac{\lambda_k}{\gamma} W_k. \tag{3.5}$$

Using equation 3.3 with $\bar{t} = 1/(\mu C_k)$, each quantity, $W_k$,
is determined to be

$$W_k = \frac{1}{\mu C_k - \lambda_k}. \tag{3.6}$$

Note that as $\mu C_k - \lambda_k$ approaches zero, $W_k$ approaches
infinity and, thus, $\mu C_k \gg \lambda_k$ is necessary for $W_k$ to be
reasonable. The network has now been decomposed into a set of
single-channel problems as described by equations 3.5 and 3.6. However,
these equations must be adjusted to account for the general
(non-exponential) packet length used by the simulation supporting the
research of this dissertation.
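
As an illustration of the decomposition, the per-channel delay
$1/(\mu C_k - \lambda_k)$ and its traffic-weighted sum over all channels
can be computed directly. This is a sketch under stated assumptions: the
two-channel network, the 528-bit mean packet length, and the traffic
figures are hypothetical inputs chosen only to exercise the formulas.

```python
def channel_delay(mu, capacity, lam_k):
    """Expected wait-plus-transmit time on one channel with exponential lengths.

    mu       : reciprocal of the mean packet length (packets per bit)
    capacity : channel capacity C_k in bits per second
    lam_k    : packet traffic on the channel in packets per second
    """
    service_rate = mu * capacity          # packets per second the channel can carry
    if lam_k >= service_rate:
        raise ValueError("channel is saturated: lam_k must be below mu * C_k")
    return 1.0 / (service_rate - lam_k)


def network_delay(mu, channels, gamma):
    """Traffic-weighted sum of channel delays.

    channels : list of (capacity, lam_k) pairs
    gamma    : total external traffic entering the network (packets per second)
    """
    return sum(lam_k / gamma * channel_delay(mu, c_k, lam_k)
               for c_k, lam_k in channels)


mu = 1.0 / 528.0                                 # assumed mean length of 528 bits
channels = [(9600.0, 5.0), (9600.0, 10.0)]       # hypothetical channel loads
delay = network_delay(mu, channels, gamma=10.0)  # seconds
```

The saturation check mirrors the observation that the delay grows
without bound as the channel load approaches $\mu C_k$.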

Using the famous Pollaczek-Khinchin (P-K) formula (37) for the
general service pattern (M/G/1), the average waiting time is determined
as

$$W = \frac{\rho^2 + \lambda^2 \sigma^2}{2\lambda(1 - \rho)} + \frac{1}{\mu} \tag{3.7}$$

where $\sigma^2$ is the variance of the service-time distribution and
$\rho$ is the utilization factor, $\lambda/\mu$. If $1/\mu$ is set to be
constant, the model possesses a deterministic service pattern (M/D/1)
with $\sigma^2 = 0$. Therefore equation 3.7 simplifies to

$$W = \frac{\rho^2}{2\lambda(1 - \rho)} + \frac{1}{\mu}. \tag{3.8}$$


The P-K results are adjusted for channel capacity, $C_k$, by allowing
$\rho_k = \lambda_k/(\mu C_k)$ which, after the appropriate
manipulation, yields average waiting time per channel as

$$W_k = \frac{2 - \rho_k}{2(\mu C_k - \lambda_k)}. \tag{3.9}$$

However, in systems with packet lengths of general distribution, average
waiting time per channel is calculated with $\sigma^2 > 0$ resulting in

$$W_k = \frac{2 - \rho_k(1 - \mu^2 \sigma^2)}{2(\mu C_k - \lambda_k)}. \tag{3.10}$$

This equation is necessary to account for the differences in packet
lengths between user messages (528 bits), minimum delay packets (480
bits), and acknowledgement packets (118 bits). Equations 3.9 and 3.10
are proposed as approximations with the assumption that the Markovian
character of the traffic flow has been preserved (31).
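
The two per-channel expressions can be compared with a short
calculation. The sketch below assumes that $\sigma^2$ in equation 3.10
is the variance of the packet length in bits squared, so that
$\mu^2 \sigma^2$ is dimensionless; that reading is an interpretation,
not something stated in the text. Under it, a variance of zero recovers
the fixed-length result of equation 3.9, and a variance of $1/\mu^2$
(exponential lengths) recovers $1/(\mu C_k - \lambda_k)$. The numbers
are hypothetical.

```python
def channel_wait(mu, capacity, lam_k, length_variance):
    """Average time per channel for a general packet-length distribution.

    mu              : reciprocal of the mean packet length (packets per bit)
    capacity        : channel capacity C_k in bits per second
    lam_k           : channel traffic in packets per second
    length_variance : variance of packet length in bits^2 (assumed reading
                      of sigma^2 in equation 3.10)
    """
    service_rate = mu * capacity
    if lam_k >= service_rate:
        raise ValueError("channel is saturated")
    rho_k = lam_k / service_rate
    return (2 - rho_k * (1 - mu ** 2 * length_variance)) / (2 * (service_rate - lam_k))


mu, capacity, lam_k = 1.0 / 528.0, 9600.0, 10.0

w_fixed = channel_wait(mu, capacity, lam_k, 0.0)               # constant lengths
w_expon = channel_wait(mu, capacity, lam_k, (1.0 / mu) ** 2)   # exponential lengths
```

Fixed-length packets give the smaller delay, as expected when the
variance term is removed.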

The effects of propagation time and overhead traffic are included
by the following argument. Let $1/\mu'$ represent the average length of
user traffic (excluding acknowledgements, headers, requests for next
messages, parity checks, etc.) and let $1/\mu$ represent the average
length of all packets. Total system delay is determined by

$$W = K + \sum_{k} \frac{\lambda_k}{\gamma} \left[ \frac{1}{\mu' C_k} + \frac{\lambda_k / \mu C_k}{\mu C_k - \lambda_k} + P_k + K \right] \tag{3.11}$$

where K is the nodal processing time (assumed constant), $P_k$ is the
propagation delay on the kth channel and
$(\lambda_k/\mu C_k)/(\mu C_k - \lambda_k)$ is the average time a packet
waits at a node for the kth channel to become available.

In order to minimize simulation costs, certain elements of equation
3.11, in addition to packet lengths, are set constant. They are:

1. A constant nodal processing delay, K, of $10^{-3}$ seconds at each
node traversed.

2. A channel propagation delay, $P_k$, consisting of eight
microseconds per circuit mile and a fixed modem delay of seventy
microseconds per channel.

3. A fixed channel capacity of 9600 bits per second.

4. A fixed node-host link capacity of 256 kilobits per second.

5. A constant user packet length, $1/\mu'$, of 528 bits.

Defining $L_k$ to be the length of line k, total delay can now be
calculated as

$$W = \sum_{k} \frac{\lambda_k}{\gamma} \left[ 0.055 + \frac{\lambda_k / 9600\mu}{9600\mu - \lambda_k} + P_k + 10^{-3} \right] + 10^{-3} + \frac{2}{256000\mu} \tag{3.12}$$

where

$$P_k = L_k (8 \times 10^{-6}) + 7 \times 10^{-5}. \tag{3.13}$$

This completes the network delay analysis using the decomposition-
synthesis technique.
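
The constants listed above make the total delay straightforward to
evaluate. The following sketch computes equations 3.12 and 3.13 for a
hypothetical single-channel network; the 400-bit average length for all
packets, the 100-mile line, and the traffic rates are assumptions made
for the example and do not come from the simulation.

```python
MILE_DELAY_S = 8e-6          # propagation delay per circuit mile
MODEM_DELAY_S = 7e-5         # fixed modem delay per channel
NODE_DELAY_S = 1e-3          # nodal processing delay K
CHANNEL_BPS = 9600.0         # fixed channel capacity
HOST_LINK_BPS = 256000.0     # fixed node-host link capacity
USER_PACKET_BITS = 528.0     # constant user packet length 1/mu'


def propagation_delay(miles):
    """Equation 3.13: line propagation plus modem delay, in seconds."""
    return miles * MILE_DELAY_S + MODEM_DELAY_S


def total_delay(channels, gamma, mu):
    """Equation 3.12 for a list of (lam_k, miles) channel descriptions.

    gamma : total traffic entering the network (packets per second)
    mu    : reciprocal of the average length of all packets (packets per bit)
    """
    service_rate = CHANNEL_BPS * mu
    total = 0.0
    for lam_k, miles in channels:
        if lam_k >= service_rate:
            raise ValueError("channel is saturated")
        queue_wait = (lam_k / service_rate) / (service_rate - lam_k)
        total += lam_k / gamma * (
            USER_PACKET_BITS / CHANNEL_BPS   # 0.055 s transmission time
            + queue_wait
            + propagation_delay(miles)
            + NODE_DELAY_S
        )
    return total + NODE_DELAY_S + 2.0 / (HOST_LINK_BPS * mu)


# Hypothetical input: one 100-mile channel carrying 4 packets per second,
# with an assumed average length of 400 bits over all packet types.
delay = total_delay([(4.0, 100.0)], gamma=4.0, mu=1.0 / 400.0)
```

For these inputs the transmission term (0.055 seconds) dominates the
result, with queueing, propagation, and processing contributing only a
few milliseconds each.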


where x represents connectivity from node i to node j and M is the total
number of links in the network. The cost function for any particular
link in the network can therefore be designated $f_x(\gamma_x)$ where
$\gamma_x$ represents the combined traffic loads at node i and node j.
This section contains an analytical description of the cost function and
its use in determining the respective values $p_{ij} \in P$, the network
probability matrix.


In determining which outgoing link to select for transmitting a
packet from a node, say node i, consider the network (or graph) being
converted into a tree with node i as the root. This is accomplished
by eliminating all parallel paths from the network starting with a search
from node i. Any node intersected by more than one path is considered to
be a separate vertex within each path. Also, paths are eliminated which
converge back to node i or which intersect other nodes of the same level
as the current node being evaluated. Consistent with previous
definitions, let n represent the maximum number of outgoing links
possessed by any node in the network and $n_i$ be the actual number of
links emanating from node i. Finally, define $\ell$ to be the lowest
level in the tree, which, for practical reasons, is constrained never to
exceed four in the simulation model. Furthermore, to reduce search time,
$\ell$ is dynamically reduced whenever a destination node is encountered
prior to reaching level n in a search of the tree. Label each node,
starting with i = 1, using a depth-first traversal from left to right.

The  cost  of  a path,  v,  whose  root  is  node  i is  determined  to  be 

c,v  ■ e(6)  * j0  tVkWg.kj/Vk.j^-  (3J4) 

$1 \le v \le \ell n$, $1 \le j \le n_i$,

where  e is  a function  of  the  rectangular  coordinates,  e,  of  the  leaf  node 
(Figure  8)  and  the  destination  node,  and  g is  a weighting  function  which 
inverts  increasing  powers  of  depending  upon  the  algorithm  evaluated. 

The cost function for connectivity between i and j is defined as

$$f_x(\gamma_x) = \alpha_x + \beta_x + r \tag{3.15}$$

where $\alpha_x$ = distance in miles times eight microseconds per mile
propagation delay,

$\beta_x$ = combined queue lengths in packets times one millisecond
processing time, and

r = 70 microsecond modem delay.

Figure 8. Tree representation of equation 3.14

Routing algorithms investigated by this research select for transmission
the link which corresponds to the first level in a path whose cost is
defined by:

$$\min \{ c_{iv},\ 1 \le v \le \ell n \}. \tag{3.16}$$
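
The selection rule can be illustrated with the link cost of equation
3.15. The sketch below is a simplification: it takes the cost of a path
to be the plain sum of its link costs, which stands in for the full path
cost of equation 3.14 (the coordinate-based term and the level weighting
are omitted). The candidate paths, mileages, and queue lengths are
hypothetical.

```python
MILE_DELAY_S = 8e-6        # eight microseconds per mile
PACKET_PROCESS_S = 1e-3    # one millisecond per queued packet
MODEM_DELAY_S = 70e-6      # seventy microsecond modem delay


def link_cost(miles, queued_packets):
    """Link cost: propagation + queue processing + modem delay, in seconds."""
    return miles * MILE_DELAY_S + queued_packets * PACKET_PROCESS_S + MODEM_DELAY_S


def cheapest_first_hop(paths):
    """Return (first_hop, cost) of the cheapest candidate path.

    paths : list of (first_hop, links) where links is a list of
            (miles, queued_packets) tuples along that path.
    Path cost here is the simple sum of link costs (an assumption).
    """
    best_hop, best_cost = None, float("inf")
    for first_hop, links in paths:
        cost = sum(link_cost(miles, queued) for miles, queued in links)
        if cost < best_cost:
            best_hop, best_cost = first_hop, cost
    return best_hop, best_cost


# Two hypothetical two-link paths from the root node to a destination.
paths = [
    ("a", [(50.0, 3), (20.0, 0)]),    # short lines, but a queue at the first hop
    ("b", [(200.0, 0), (10.0, 1)]),   # longer line, lightly loaded
]
hop, cost = cheapest_first_hop(paths)
```

In this example the longer but lightly loaded path through node b is
selected, since queueing delay outweighs propagation delay at these
distances.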

Finally, with these definitions a better interpretation of the
probability matrix, P, is available. Specifically, $p_{ij} \in P$ is now
proposed as:

$$p_{ij} = \frac{c_{ij} \sum_{w \ne 0} p_i(w)}{\sum_{v=1}^{\ell n} c_{iv}}, \quad 1 \le i \le N, \ 1 \le j \le n_i \tag{3.17}$$

where $\sum_{w \ne 0} p_i(w)$ is the probability of one or more messages
in the network located at node i and N, as defined in section 3.1, is
the number of nodes in the network. The network cost function and
corresponding network probability matrix are now explicitly defined.


3.4  The  Minimum  Cost  Flow  Problem 

Attention is now turned to determining an optimal routing algorithm
as a function of cost. A three phase argument is presented. First, an
algorithm, B*, is defined to be a minimum cost algorithm if it
selects the minimum cost path for any node j adjacent to the root
node i in any tree $t_i$. The second phase establishes an ordered search
algorithm and, subsequently, an optimal search algorithm, A*, for the
optimum path of a network. Care should be taken to maintain a distinction
between a path, as used in searching a tree, and path P, as used in
traversing a network. The third and final phase relates the minimum cost
and optimum search algorithms to derive optimum routing.

3.4.1  Determining  the  Path  of  Least  Cost 

Let there exist a tree, $t_i \in T = \{t_1, t_2, \ldots, t_N\}$, where T
represents a set of trees constructed from an N node network as defined
in section 3.3. Level $\ell$ is defined to be the maximum level in $t_i$
and is equivalent to the search depth described in section 4.1.

Definition 3.1: For any $t_i$, a minimum cost algorithm, B*, is one
which selects $\min \{ c_{iv},\ 1 \le v \le \ell n \}$ where v is a path
passing through the root node i of $t_i$.

Lemma 3.1: A minimum cost algorithm which always selects
$[\min \{ c_{iv},\ 1 \le v \le \ell n \} \ \forall i]$ is a minimum cost
algorithm $\forall t \in T$.

Proof: The proof is clear from Definition 3.1 if all t are
constructed in the same manner and if equation 3.14 holds $\forall i$.

3.4.2  Determining  an  Optimum  Traversal 

The following discussion builds the basis for an ordered search
algorithm which is used in determining the minimum cost path in a network.
Let there exist some function $\delta$ that could be used to dynamically
order T for evaluation and let $\delta(x)$ denote the value of the
function at node x. Order all nodes adjacent to x in increasing order of
their $\delta$ values. An ordered search algorithm can be derived which
next selects for evaluation that tree corresponding to the adjacent node
with the smallest $\delta$. The ordered search algorithm consists of the
following steps (66):

1. Assume that a network is in state S and that a message is
originated at node $x_s$, the root of tree $t_{x_s}$.

2. Compute $\delta(x_{j_v})$, $1 \le j \le n_x$, where $n_x$ represents
the number of arcs emanating from $x_s$, j represents the nodes adjacent
to $x_s$, and v represents the paths passing through each j.

3. Select that link for transmission which corresponds to the
smallest $\delta \ \forall x_j$. Map the system into state S = S + 1 by
transmitting the message over the selected link and renaming the
receiving adjacent node $x_s$, the root of a new tree, $t_{x_s}$.

4. If the new $x_s$ is the destination, then the search is complete.

5. If not, go to 2.
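
The five steps above can be sketched as a hop-by-hop forwarding loop.
This is only an illustration: the evaluation function used here is the
straight-line distance from a neighbor to the destination, chosen as a
stand-in for $\delta$, and the five-node topology and its coordinates
are hypothetical. A hop limit is added because a purely greedy choice is
not guaranteed to terminate on every topology.

```python
import math


def ordered_search(neighbors, delta, source, destination, max_hops=32):
    """Forward a message hop by hop, always to the adjacent node with the
    smallest evaluation value, until the destination becomes the root."""
    route = [source]
    current = source
    while current != destination:
        if len(route) > max_hops:
            raise RuntimeError("hop limit reached before the destination")
        # Steps 2 and 3: evaluate every adjacent node, pick the smallest.
        current = min(neighbors[current], key=lambda x: delta(x, destination))
        route.append(current)   # the receiving node becomes the new root
    return route


# Hypothetical network with rectangular coordinates for each node.
coords = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (1, 1), "E": (2, 1)}
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def straight_line(x, destination):
    """Stand-in evaluation function: distance from x to the destination."""
    return math.dist(coords[x], coords[destination])


route = ordered_search(neighbors, straight_line, "A", "E")
```

For this topology the message travels A, B, D, E. Each node makes its
decision using only the values computed for its own neighbors, which is
the property that keeps the procedure independent of network size.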

3. 4. 2.1  An  Optimal  Search  Algorithm  (66) 

Define $\delta(x)$ as an estimate of the cost of a minimal cost path for
a message constrained to go through node x. Also, let the function
$\eta(x_i, x_j)$ give the actual cost of a minimal cost path between two
arbitrary nodes $x_i$ and $x_j$. If D is a set of all possible
destination nodes, the cost of a minimum cost path from $x_i$ to a
destination is

$$c(x_i) = \min_{x_j \in D} \eta(x_i, x_j). \tag{3.18}$$

By definition, any path from node $x_i$ to a destination node that
achieves $c(x_i)$ is an optimal path from $x_i$ to a destination.

Let $s(x) = \eta(x_s, x)$ represent the actual cost of an optimal path
from a start node $x_s$ to some arbitrary node x. The actual cost of an
optimal path from a start node $x_s$ to a destination node $x_j$
constrained to go through x is therefore defined as

$$\delta(x) = s(x) + c(x). \tag{3.19}$$

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Let  an  estimate  of  <S(x)  be  represented  by 

<s(x)  = /.  (x ) + 4 (x ) (3.20) 

where  4 is  an  estimate  of  4 and  4 is  an  estimate  of  4.  An  obvious 
choice  for  4(x)  is  given  by  the  sum  of  the  costs  of  arcs  from  x$  to  x. 
This  implies  that  c(x)  ^4(x).  An  estimate  of  c(x)  shall  be  obtained 
by  the  evaluation  of  any  heuristic  information  available  from  the  problem 
domain.  Using  these  definitions,  let  there  exist  an  ordered  search 
algorithm.  A*,  which  uses  equation  3.20  as  an  evaluation  function. 
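A minimal Python sketch of an ordered search driven by an evaluation function of the form in equation 3.20 (cost accumulated from the start node plus a heuristic estimate of the remaining cost) follows. The function name `a_star`, the arc-list representation, and the example network are assumptions for illustration only. A node is implicitly reopened whenever a cheaper path to it is found.

```python
import heapq
import math


def a_star(arcs, remaining_estimate, start, destinations):
    """Return (cost, path) of a minimum cost path from start to any node in
    destinations, or None. arcs maps node -> list of (neighbor, arc_cost)."""
    best_cost = {start: 0.0}             # cheapest known cost from the start
    parent = {start: None}
    open_heap = [(remaining_estimate(start), 0.0, start)]
    while open_heap:
        _, cost, x = heapq.heappop(open_heap)
        if cost > best_cost[x]:
            continue                     # stale entry; a cheaper path exists
        if x in destinations:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return cost, path[::-1]
        for y, arc_cost in arcs.get(x, ()):
            candidate = cost + arc_cost
            if candidate < best_cost.get(y, math.inf):
                best_cost[y] = candidate           # (re)open y
                parent[y] = x
                estimate = candidate + remaining_estimate(y)
                heapq.heappush(open_heap, (estimate, candidate, y))
    return None


# Hypothetical network; the heuristic never overestimates the true cost.
arcs = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("d", 6)],
        "b": [("d", 1)], "d": []}
heuristic = {"s": 3, "a": 3, "b": 1, "d": 0}
result = a_star(arcs, heuristic.get, "s", {"d"})
```

With a heuristic of zero everywhere the same function reduces to a uniform-cost search, which is a convenient way to cross-check the result.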

3.4.2.2 The Admissibility of A* (66)

Let a routing algorithm be admissible if for any network a message
always reaches its destination in an optimum path, P. The following
arguments show that if $\hat{\zeta}$ is a lower bound on $\zeta$, then A* is admissible.

Lemma 3.2: If $\hat{\zeta}(x) \le \zeta(x)\ \forall x$, then before A* terminates and for
any P from node $x_s$ to a destination node, there exists an unevaluated
(open) node x' on P with $\hat{\delta}(x') \le \delta(x_s)$ (66).

Proof: Let P be represented by an ordered sequence, $\Pi = x_1, x_2, x_3,
\ldots, x_d$, where $x_d$ is a destination node. Now, there exists an open node
$x' \in \Pi$, since if $x_d$ is closed, A* has terminated. By definition of $\hat{\delta}$,

$$ \hat{\delta}(x') = \hat{\xi}(x') + \hat{\zeta}(x'). \tag{3.21} $$

Since x' is on P and all of its antecedents on P are closed, A* can be
said to have found an optimal path to x'. Therefore $\hat{\xi}(x') = \xi(x')$ and

$$ \hat{\delta}(x') = \xi(x') + \hat{\zeta}(x'). \tag{3.22} $$

Now, assuming $\hat{\zeta}(x') \le \zeta(x')$, it can be stated that

$$ \hat{\delta}(x') \le \xi(x') + \zeta(x') = \delta(x'). \tag{3.23} $$

However, since $\delta(x) = \delta(x_s)\ \forall x$ on P, $\hat{\delta}(x') \le \delta(x_s)$ as claimed by the
lemma.

It is now shown that A* is admissible if $\hat{\zeta}$ is a lower bound on $\zeta$.

Theorem 3.1. If $\forall x\ \hat{\zeta}(x) \le \zeta(x)$ and if all arc costs are greater
than some small number $\epsilon$, then A* is admissible.

Proof: Proof is by assuming the contrary, namely that A* does not
always terminate by finding an optimal network traversal. Three cases
must be considered to complete the proof.

Case 1: Termination without finding the destination. Because of
step 4 in the ordered search algorithm, termination cannot occur without
finding the destination.

Case 2: No termination. Let there exist $x_d$, the destination node
which is obtainable from $x_s$ in finite steps with minimum cost $\delta(x_s)$.
Arc cost $> \epsilon \Rightarrow$ there exists $\hat{\delta}(x) \ge \hat{\xi}(x) \ge \xi(x) > W\epsilon = \delta(x_s)\ \forall x$ farther
than $W = \delta(x_s)/\epsilon$ steps from $x_s$. By Lemma 3.2, there exists x' on
P $\ni \hat{\delta}(x') \le \delta(x_s) < \hat{\delta}(x)$ and no x further than W steps from $x_s$ is ever
evaluated. Therefore, by step 3 of the ordered search algorithm, A*
will select x' for transmission instead of x. The failure of A* to
terminate can thus only be a result of continued reopening of nodes
within W steps of $x_s$. Let there exist a set of $\nu(W)$ nodes, called
$\chi(W)$, within W steps of $x_s$. Assume there exists a finite number of paths
from $x_s$ to x through nodes within W steps of $x_s$. (This assumption is
valid because of the detect and suppress mechanism described in
Chapter IV to inhibit looping.) Accordingly, $x \in \chi(W)$
can only be opened at most $\rho(x, W)$ times, so let

$$ \rho(W) = \max_{x \in \chi(W)} \rho(x, W) \tag{3.24} $$

represent the maximum number of times any one node can be reopened. Hence, all
$x \in \chi(W)$ must forever be closed after $\nu(W)\rho(W)$ evaluations. Since no
$x \notin \chi(W)$ can be evaluated, A* must terminate.

Case 3: Termination at a destination without achieving minimal
cost. Assume A* terminates at a destination $x_d$ with $\hat{\delta}(x_d) = \hat{\xi}(x_d) > \delta(x_s)$.
However, Lemma 3.2 shows that just before termination there exists an
open node x' on P with $\hat{\delta}(x') \le \delta(x_s) < \hat{\delta}(x_d)$. Therefore, x' would have
been selected for transmission rather than $x_d$, contradicting the
assumption that A* terminated.

Using these results, the next section shows that a reasonable
limitation on the function $\hat{\zeta}(x)$ produces an optimal A*.

3.4.2.3 The Optimality of A* (66)

Consider  the  following  restrictions: 

1. When A* selects x for transmission, it has already found an
optimal path to x such that $\hat{\xi}(x) = \xi(x)$.

2. When A* selects a node x, $\hat{\delta}(x) \le \delta(x_s)$. This is true because
of dynamic changes in $\delta(x_s)$ as the network goes from state to state and
because of the influential power of $\hat{\zeta}(x)$.

3. Assume that the difference between the estimated costs to a
destination from any pair of nodes $x_i$ and $x_j$ is a lower bound on the
true cost of an optimal path from $x_i$ to $x_j$, that is, $\hat{\zeta}(x_i) - \hat{\zeta}(x_j)
\le \eta(x_i, x_j)$. According to Nilsson, this assumption will not be violated
if the information used to calculate $\hat{\zeta}$ is applied consistently. He
appropriately refers to it as the consistency assumption.
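The consistency assumption in restriction 3 can be checked mechanically on a small network. The sketch below is illustrative only: the function names, the use of the Floyd-Warshall recurrence to obtain the true minimal costs between every pair of nodes, and the example data are assumptions made here.

```python
import math


def all_pairs_minimal_cost(nodes, arcs):
    """True minimal path cost between every ordered pair of nodes
    (Floyd-Warshall). arcs maps node -> list of (neighbor, arc_cost)."""
    cost = {(i, j): (0.0 if i == j else math.inf) for i in nodes for j in nodes}
    for i, outgoing in arcs.items():
        for j, arc_cost in outgoing:
            cost[(i, j)] = min(cost[(i, j)], arc_cost)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if cost[(i, k)] + cost[(k, j)] < cost[(i, j)]:
                    cost[(i, j)] = cost[(i, k)] + cost[(k, j)]
    return cost


def is_consistent(nodes, arcs, estimate):
    """True if estimate[i] - estimate[j] never exceeds the true minimal
    cost from i to j, for every ordered pair (i, j)."""
    cost = all_pairs_minimal_cost(nodes, arcs)
    return all(estimate[i] - estimate[j] <= cost[(i, j)]
               for i in nodes for j in nodes)


# Hypothetical network and two heuristics; both are lower bounds on the
# true remaining cost to "d", but only the first is consistent.
nodes = ["s", "a", "b", "d"]
arcs = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("d", 6)],
        "b": [("d", 1)], "d": []}
consistent_estimate = {"s": 3, "a": 3, "b": 1, "d": 0}
inconsistent_estimate = {"s": 4, "a": 1, "b": 1, "d": 0}
```

The second heuristic fails because its estimate drops by 3 between `s` and `a` while the true cost between them is only 1.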

Lemma 3.3. Given the consistency assumption and that x is closed
by A* $\Rightarrow \hat{\xi}(x) = \xi(x)$.

Proof: Consider the sequence of nodes selected by A* just before
closing x and assume $\hat{\xi}(x) > \xi(x)$. Now, there exists a P from $x_s$ to x.
Because $\hat{\xi}(x) > \xi(x)$, A* did not find P. But by Lemma 3.2, there exists
an open node x' on P with $\hat{\xi}(x') = \xi(x')$. The lemma is proved if x' = x
but, if not,

$$ \xi(x) = \xi(x') + \eta(x', x) = \hat{\xi}(x') + \eta(x', x). \tag{3.25} $$

Assuming that

$$ \hat{\xi}(x) > \xi(x) \Rightarrow \hat{\xi}(x) > \hat{\xi}(x') + \eta(x', x). \tag{3.26} $$

Adding $\hat{\zeta}(x)$ to both sides produces

$$ \hat{\xi}(x) + \hat{\zeta}(x) > \hat{\xi}(x') + \eta(x', x) + \hat{\zeta}(x) \tag{3.27} $$

which, when applying the consistency assumption, yields

$$ \hat{\xi}(x) + \hat{\zeta}(x) > \hat{\xi}(x') + \hat{\zeta}(x'). \tag{3.28} $$

This is equivalent to $\hat{\delta}(x) > \hat{\delta}(x')$ which contradicts the fact that A*
selected x for evaluation when x' was available. The proof is now
complete.

Lemma 3.4. If $\hat{\delta}$ is the lower bound on $\delta$, then $\forall x$ closed by A*,
$\hat{\delta}(x) \le \delta(x_s)$.

Proof: Referring to Lemma 3.2, let x be closed by A*. Clearly,
if x is the destination, then $\hat{\delta}(x) = \delta(x_s)$. Assume that x is not the
destination. Since A* closed x before termination, there exists an
x' on P from $x_s$ to $x_d$ with $\hat{\delta}(x') \le \delta(x_s)$. The proof is finished if
x = x'; otherwise x was chosen for transmission in lieu of x' so it must
be that

$$ \hat{\delta}(x) \le \hat{\delta}(x') \le \delta(x_s). \tag{3.29} $$

The optimality of A* can now be proved.

Theorem 3.2. Let there exist two admissible algorithms, A and A*,
$\ni$ A* is more informed than A. Also, let the consistency assumption be
satisfied by $\hat{\zeta}$ in A*. If x' was selected by A in lieu of x which was
selected by A*, then $\hat{\delta}(x) < \hat{\delta}(x')$.

Proof: Assume the contrary $\ni \hat{\delta}(x) \ge \hat{\delta}(x')$. Now

$$ \hat{\delta}(x') = \hat{\xi}(x') + \hat{\zeta}(x') \Rightarrow \hat{\zeta}(x') = \hat{\delta}(x') - \hat{\xi}(x'). \tag{3.30} $$

A must have known that $\delta(x) \ge \delta(x_s)$ and therefore it knows that

$$ \zeta(x) \ge \delta(x_s) - \xi(x). \tag{3.31} $$

This equation has a lower bound estimate of

$$ \hat{\zeta}(x) = \delta(x_s) - \xi(x). \tag{3.32} $$

However, A* used the function

$$ \hat{\delta}(x) = \hat{\xi}(x) + \hat{\zeta}(x) \tag{3.33} $$

in selecting x. From Lemma 3.4, $\hat{\delta}(x) \le \delta(x_s) \Rightarrow$

$$ \hat{\xi}(x) + \hat{\zeta}(x) \le \delta(x_s). \tag{3.34} $$

Therefore, whatever $\hat{\zeta}$ used by A*, it must have satisfied the inequality

$$ \hat{\zeta}(x) \le \delta(x_s) - \hat{\xi}(x). \tag{3.35} $$

By Lemma 3.3, $\hat{\xi}(x) = \xi(x)$ when A* selected x. Thus

$$ \hat{\zeta}(x) \le \delta(x_s) - \xi(x). \tag{3.36} $$

However, in selecting x', A used information about x which permitted a
lower bound on $\zeta(x)$ at least as large as the lower bound used by A*.
Thus A* must not have been more informed than A, contradicting the
assumption and proving the theorem.

3.4.3  Optimum  Routing  as  a Function  of  A*  and  B* 

n 

B*  uses  c.  , 1 < v < i , as  defined  in  section  3.3  to  determine  the 
'v 

minimum cost path in tree $t_i$. The cost, $c_{i_v}$, however, is used only to
select the node j adjacent to node i which reflects the path of least
resistance to the destination. Its value also includes the cost of
i - 1 levels beyond node j plus a cost factor which is a function of
the coordinates of the vertex at the end of the path and those at the
destination. Because of the varying load conditions in a network, the
path used in deriving the minimum $c_{i_v}$ may in fact not be traversed
beyond the adjacent node j.

Proposition  3.1.  An  optimum  path  through  a network  may  be  described 
as  a function  of  A*  and  B*. 

Discussion: First, $c_{i_v}$ must be decomposed so that elements used
in its derivation are shown to be representative of the results from
applying A*. Let $c'_{i_v}$ represent that cost portion of $c_{i_v}$ between node i
and the adjacent node j along path v. Then the actual cost, $\xi(i)$, from
the start node $x_s$ to the current node selected for transmission can
be represented by $\sum_{w=1}^{m} c'_{w_v}$ where v represents an association factor between
the path selected using B* and the path, P, derived using A*. Let P' be
that portion of the optimum path P already traversed such that m is the
number of nodes along P'. An estimate for the remaining cost of P is
represented by

di)  - j,  [gUlf^y $,,]  * c(ov).  (3.37) 


Note  that  g and  f here  are  summed  from  1 to  i instead  of  0 to  i as  in 
equation  3.14.  This  is  because  the  cost  for  k = 0 is  included  in  ^(i) 
as  the  adjacent  node  selected  for  transmission.  The  evaluation  function 
used  by  A*  is  determined  from  equation  3.22  to  be 

*(i)  = JK  * j,  + e(ev>-  (3-38) 


Briefly, it should now be apparent that the use of B* is embedded in A*,
whose logic and proof of optimality, extracted from Nilsson, are used to
derive optimal routing from point of generation to destination.

3.5 Summary

In opening paragraphs, terminology and an example are presented as
a basis for subsequent analytical presentations. Development of the
analytical model is made with two arguments, one using the classical
M/M/C queuing system (details contained in Appendix A) and the other
using decomposition-synthesis with the independence assumption as
postulated by Kleinrock. A network cost function is then used to derive the
probability that a message is transmitted from node i to node j. These
developments provide the foundation for final arguments which propose
an algorithm, A*, for determination of an optimal path through a network.
This algorithm is used as a basis for the design of the simulation model
described in the following chapter.

CHAPTER IV

THE NETWORK SIMULATION MODEL

4. Introduction

One  of  the  main  reasons  for  using  simulation  is  that  many  systems 
defy  standard  mathematical  or  operations  research  techniques  for  their 
solution.  This  may  originate  from  inherent  and  unavoidable  processes 
in  the  system  to  be  evaluated.  It  may  also  be  because  of  the  sheer 
mathematical  intractability  of  the  equations  which  represent  the  system 
(1).  Both  of  the  problems  arise  in  the  evaluation  of  networks  which 
possess  schemes  designed  to  adjust  routes  in  response  to  varying  load 
conditions.  In  any  case,  it  may  be  desirable  to  use  simulation  to 
enforce  results  obtained  by  other  means. 

Simulation  supporting  the  research  of  this  dissertation  was  used  as 
a vehicle  to  evaluate  several  closely  related  routing  algorithms  for 
large  scale  networks.  The  simulation  program  is  based  on  the  abstract 
model  introduced  in  Chapter  III  and  includes  the  assumptions  made  in  the 
model's  specification.  It  is  designed  so  that  additional  routing 
procedures,  as  developed,  may  be  simulated  with  minimum  adjustments. 

GASP-IV was chosen as the simulation language because of its
simplicity, because it is FORTRAN-based, and because it is flexible and
can be easily modified (68). Modifications of GASP-IV became necessary
to extend the simulation effort beyond 19 nodes. The modifications
were required because of increases in table dimensions and output
format.

The network simulation program is stripped of all but the most
essential data collection statements. This was necessary because of
the exponential increase in computer time required to simulate
increasingly large networks (Figure 9). Particular emphasis was given to
program design so that calculations and subroutine calls were minimized.
These design considerations combined with the use of FORTRAN H (Opt-2)
reduced overall execution time by approximately 30-35%.

Though the simulation program was written for execution on the
Amdahl 470V/6, only the subroutine for generating random numbers needs
to be rewritten to port the program to other machines. The program
requires from 730,000 bytes of storage for networks of 16 nodes with 150
buffers each to 2,000,000 bytes for networks of 1024 nodes with 25
buffers each. It consists of approximately 1500 executable FORTRAN
statements comprising 14 routines exclusive of GASP modules. From
125,000 to 1,400,000 bytes of memory are devoted to event and entity
queues and roughly 80,000 bytes are required for network descriptor
tables.

The  general  system  flow  of  the  simulation  program  is  shown  in 
Figure  10.  Appendix  C contains  a user's  guide  and  a listing  of  the 
program.  GASP-IV  and  associated  modifications  can  be  obtained  from 
the  Industrial  Engineering  Department  of  Texas  A&M  University. 

Entities within the simulator are messages which are routed from
node to node by the particular algorithm being simulated. In order of
priority, the three classes of messages recognized are: acknowledgement,
update vectors for adaptive routing, and regular (user) messages. Each
message in the simulator is described in a ten word array referred to as
the message attributes. The second, third, and fourth entries in
Figure 11 are graphic presentations of the array structures for the
different classes of messages.

Figure 9. Comparison of execution times for search depth 2

Figure 10. General system flow

The top entry in Figure 11 is an event generation entity. It is
used to identify to GASP the type of action currently being simulated,
i.e., message generation, message transmission, acknowledgement, etc.
Event messages are filed by priority according to designated time of
occurrence as identified by the first word in the message. The simulator
accumulates data by tabulating statistics on each event as it occurs.

4.1  Properties  of  the  Simulator 

The  network  simulator  is  currently  limited  to  networks  of  1024  nodes 
or  less.  This  constraint  is  imposed  by  the  existing  configuration  of 
GASP-IV  as  modified  in  support  of  this  research.  Further  increases  are 
possible  with  additional  modifications  to  GASP-IV  and  corresponding 
changes  to  the  simulator. 

Networks are generated internal to the user-provided main driver
routine. The number of nodes, N, in the network to be simulated is
specified through input data, although restricted to values of $N = u^2$,
$2 \le u \le 32$. Specifying the size in this manner provides for a
programmed generation of networks possessing well distributed
properties.

Each network simulated is characterized as follows. As in the
abstract model, messages are assumed to arrive in a Poisson pattern.
They are generated by and delivered to host computers that are attached
to the network via 256 kilobit links. Each node in the network is
considered to service only one host computer. Message queues at the
hosts are regarded as FIFO and essentially infinite. Nodal queues are
ordered FIFO by message class and are finite in length. The combined
length of the service and transmission queues cannot exceed a node
buffer length specified in the input data. Hosts are prohibited from
sending their servicing nodes a message whenever the corresponding
node queues are determined to be one-half full. This ensures that a
node does not become saturated due to excessive message generation by
its host.

GASP file entity structure (exclusive of file pointers)
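The buffer rules in this paragraph can be illustrated with a small Python sketch. The class name `NodeBuffers`, its fields, and the reading of "one-half full" as "combined occupancy at or above half the buffer length" are assumptions made for this sketch, not details taken from the simulator.

```python
from dataclasses import dataclass


@dataclass
class NodeBuffers:
    """Finite buffer pool shared by a node's service and transmission queues."""
    buffer_length: int            # node buffer length from the input data
    service_queue: int = 0        # messages awaiting nodal processing
    transmission_queue: int = 0   # messages awaiting an outgoing link

    def occupancy(self) -> int:
        return self.service_queue + self.transmission_queue

    def accepts_from_link(self) -> bool:
        # Combined queue length may never exceed the buffer length.
        return self.occupancy() < self.buffer_length

    def accepts_from_host(self) -> bool:
        # The host is held off once the node queues are one-half full.
        return 2 * self.occupancy() < self.buffer_length


node = NodeBuffers(buffer_length=25, service_queue=5, transmission_queue=7)
host_allowed_at_12 = node.accepts_from_host()
node.service_queue += 1
host_allowed_at_13 = node.accepts_from_host()
```

Traffic arriving over a link is still accepted between the half-full and full marks, so only locally generated traffic is throttled.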

The  data  speed  of  interconnecting  links  between  nodes  is  set  to 
be  constant  at  9.6  kilobits  per  second.  This  speed  was  selected 
because  of  the  ability  of  some  modems  to  operate  efficiently  at  9600 
bps  with  minor  conditioning  of  voice  grade  circuits.  However,  with 
modest  reprogramming,  the  interconnecting  lines  can  be  set  to  any  other 
data rate or each link can be set to its own unique speed. Though
links are not allowed to completely fail, they are fixed with a $10^{-5}$
bit error rate.

Nodes are regarded as 100 percent reliable while nodal processing
delays are set constant at $10^{-3}$ seconds per message. Line propagation
is set at eight microseconds per mile with an additional delay of 70
microseconds to adjust for modem processing.
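As a worked illustration of these constants, the following Python sketch adds up the delay components for one hop. The additive delay model, the independence of bit errors, and the example message length and link distance are assumptions made here, not values stated in the text.

```python
LINK_RATE_BPS = 9600.0            # inter-node link speed
NODAL_DELAY_S = 1e-3              # nodal processing delay per message
PROPAGATION_S_PER_MILE = 8e-6     # line propagation delay
MODEM_DELAY_S = 70e-6             # additional modem processing delay
BIT_ERROR_RATE = 1e-5             # fixed link bit error rate


def hop_delay(message_bits, miles, rate_bps=LINK_RATE_BPS):
    """Unqueued delay for one hop: processing + transmission + propagation."""
    transmission = message_bits / rate_bps
    propagation = miles * PROPAGATION_S_PER_MILE + MODEM_DELAY_S
    return NODAL_DELAY_S + transmission + propagation


def error_free_probability(message_bits, bit_error_rate=BIT_ERROR_RATE):
    """Probability that no bit is in error, assuming independent bit errors."""
    return (1.0 - bit_error_rate) ** message_bits


# Hypothetical 960-bit message over a 100-mile link.
delay = hop_delay(960, 100)
p_ok = error_free_probability(1000)
```

For the 960-bit message the transmission time of 0.1 second dominates; propagation and modem delay together contribute less than one millisecond.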

The simulator is designed for nodes to control buffer storage,
provide acknowledgements for incoming traffic, test for transmission
errors, and perform message routing. No CPU intervention is required
once a message is placed in the transmission queue; that is,
interconnecting links are allowed to run free at their maximum data rate,
consistent with third generation computer systems which use
input/output service without CPU intervention.

Input data to the network is initialized at a load factor, RE, of
0.2 for most simulation runs. Stated in queuing terms, a load factor
of RE = 1.0 implies that the arrival rate for a node is approximately
equivalent to the maximum capacity of interconnecting links exiting
that node, that is,


RE 


A . 
J 


(4.1) 


where  A.  , A.,  and  n.  are  as  described  in  section  3.1.  Both  the 
' j J J 

initial  load  factor  and  load  factor  increment  are  provided  as  input 
parameters  to  the  simulator. 

The  following  list  is  a general  description  of  simulator  input 
provided  by  the  user: 

1.  The  number  of  nodes  to  be  simulated. 

2.  User  message  length. 

3.  An  initial  network  loading  factor. 

4.  The  load  factor  increment  value. 

5.  The  delay  vector  update  interval  in  seconds. 

6.  The  maximum  search  depth  level. 

7.  The  maximum  queue  length  per  node. 

8.  The  maximum  number  of  messages  to  be  supported  at  any  given 
time  by  the  particular  network  being  simulated. 

9.  A bias  term  for  applying  the  proper  weight  to  the  function  r(o) 
as  described  in  section  3.3. 


10. The simulator termination criteria defined as the maximum
number of messages to be simulated.

11. A print/no-print switch value for listing the network
coordinates and associated descriptive matrices.

12. A value which, when combined with the output of a random
number generator, expresses traffic load as a function of
circuit speed. The result is used to derive the traffic mean
interarrival time for each node. It is equivalent to the
inverse of the arrival rate, i.e., $1/\lambda_j$ for node j where $\lambda_j$
is as described in section 3.1.

The  input  to  the  simulator  is  used  to  build  various  topological 
and  geographical  descriptors  of  the  network.  One  of  these  descriptors, 
the  delay  table,  is  particularly  important  to  the  dynamic  feature  of 
the routing algorithm. The delay table is initialized with propagation
delays and later adjusted to include delays resulting from increasing
queue lengths at each connecting node.

Periodically,  each  node  in  the  network  will  generate  a minimum 
delay  vector  for  adjacent  nodes.  The  delay  vector  contains  the  delay 
to  all  other  nodes  at  the  specified  search  depth  adjusted  by  internal 
delays  from  the  generating  node.  In  this  manner,  each  node  is  able  to 
update  its  delay  table  with  current  delay  information.  The  table  only 
contains  delay  information  to  other  nodes  reachable  within  the  specified 
search  depth  level. 
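One possible reading of this update scheme is sketched below in Python. The table layout (destination, then adjacent node, then a delay and hop count pair), the function names, and the three-node example are all assumptions made for illustration; the text does not specify the data structures the simulator uses.

```python
def minimum_delay_vector(table, internal_delay):
    """Vector a node sends to its neighbors: for each known destination, the
    smallest delay in its table plus its own internal (queueing) delay."""
    vector = {}
    for destination, routes in table.items():
        delay, hops = min(routes.values())
        vector[destination] = (delay + internal_delay, hops)
    return vector


def apply_delay_vector(table, self_id, neighbor, link_delay, vector, search_depth):
    """Merge a neighbor's vector into this node's delay table, keeping only
    destinations reachable within the specified search depth."""
    table.setdefault(neighbor, {})[neighbor] = (link_delay, 1)
    for destination, (delay, hops) in vector.items():
        if destination == self_id or hops + 1 > search_depth:
            continue
        table.setdefault(destination, {})[neighbor] = (link_delay + delay, hops + 1)


# Hypothetical line network a - b - c, with delays in seconds.
table_b = {"a": {"a": (0.1, 1)}, "c": {"c": (0.2, 1)}}
vector_from_b = minimum_delay_vector(table_b, internal_delay=0.01)

table_a_depth2 = {}
apply_delay_vector(table_a_depth2, "a", "b", 0.1, vector_from_b, search_depth=2)

table_a_depth1 = {}
apply_delay_vector(table_a_depth1, "a", "b", 0.1, vector_from_b, search_depth=1)
```

With a search depth of 2, node `a` learns a route to `c` through `b`; with a depth of 1 its table holds only the adjacent node.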

The optimum update interval was determined to be five seconds during
the tuning phase of the simulator (Figure 12). With the exception of the
256 node simulations, all simulations were made with the five second
minimum delay update interval. The update interval was relaxed to ten
seconds for the 256 node simulation to allow more execution time for
message routing.

Figure 12. Results of update interval (in secs)

A node interrogates the delay table upon receiving a message for
which an outgoing link must be determined. The results of the
interrogation are then modified according to the particular algorithm that is
simulated. Finally, the outgoing link whose cost is determined to be
minimum is selected for transmission. Each subsequent node that receives
the message performs this evaluation until the destination node is reached.
In general, the routing policy consists of identifying the adjacent node
which corresponds to the minimum delay route to the destination.
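The routing decision described here amounts to a minimum over the delay table entries for the destination. A short Python sketch follows; the table layout, the `adjust` hook standing in for the algorithm-specific modification, and the example values are hypothetical.

```python
def select_outgoing_link(delay_table, destination, adjust=None):
    """Return the adjacent node offering the minimum (adjusted) delay to the
    destination, or None if the destination is not in the table."""
    routes = delay_table.get(destination)
    if not routes:
        return None
    if adjust is None:
        adjust = lambda neighbor, delay: delay   # no algorithm-specific change
    return min(routes, key=lambda neighbor: adjust(neighbor, routes[neighbor]))


# Hypothetical delay table at one node: destination -> {adjacent node: delay}.
delay_table = {"d": {"b": 0.4, "c": 0.3}}
plain_choice = select_outgoing_link(delay_table, "d")


# Hypothetical algorithm-specific modification: penalize neighbor "c".
def penalize_c(neighbor, delay):
    return delay + (0.2 if neighbor == "c" else 0.0)


adjusted_choice = select_outgoing_link(delay_table, "d", adjust=penalize_c)
```

Keeping the modification in a separate function mirrors the text's point that the same table interrogation serves each routing algorithm being simulated.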

4.2  General  Program  Flow 

The  user  provided  driver  program  builds  the  simulated  network  by 
generating  rectangular  coordinates,  each  coordinate  a function  of  the 
previously  generated  set.  The  initial  coordinates  (node  1)  are  set  to 
(x,  y)  = (0,  0)  where  x and  y correspond  to  values  along  the  horizontal 
and  vertical  axes  of  the  rectangular  plane. 

The  maximum  numerical  value  of  the  coordinates  could  have  an  impact 
on  the  number  of  overhead  bits  required  in  each  message.  The  following 
procedure  will  ensure  that  the  coordinate  values  are  minimized. 

1. Rotate the rectangular plane such that all points along a
linear regression line representing the nodes of the network are
equidistant from the x and y axes. Rotation is achieved by

$$ x' = x \cos\phi - y \sin\phi $$
$$ y' = x \sin\phi + y \cos\phi $$

where $\phi$ is the angle between the linear regression line and the
hypotenuse of a right isosceles triangle with the x and y axes forming
the legs. Rotation is probably unjustified for networks whose nodes
are poorly correlated.

2. Collapse the plane onto the network by subtracting the value
of x' for the node closest to the y' axis (called $x'_{\min}$) from all x' in
the network. Correspondingly, subtract the value of y' for the node
closest to the x' axis (called $y'_{\min}$) from all y' in the network. The
newly derived coordinates are represented by

$$ (x''_i, y''_i) = (x'_i - x'_{\min},\ y'_i - y'_{\min}), \quad 1 \le i \le N, \tag{4.2} $$

where N is the number of nodes in the network.
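The two-step procedure can be sketched in Python as follows. The function names are hypothetical, and two interpretations are assumed: the target of the rotation is taken to be the 45 degree line y = x, and the nodes "closest to the axes" are taken to be those with the minimum rotated coordinates.

```python
import math


def rotate(points, phi):
    """Rotate every (x, y) point counterclockwise by phi radians."""
    c, s = math.cos(phi), math.sin(phi)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def collapse(points):
    """Shift the points so that the smallest x and y values become zero."""
    x_min = min(x for x, _ in points)
    y_min = min(y for _, y in points)
    return [(x - x_min, y - y_min) for x, y in points]


def regression_angle(points):
    """Angle of the least squares line through the points, in radians."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.atan2(sxy, sxx)


def normalize(points):
    """Rotate the regression line onto y = x, then collapse onto the axes."""
    phi = math.pi / 4 - regression_angle(points)
    return collapse(rotate(points, phi))


# Hypothetical network whose nodes lie along the x axis.
normalized = normalize([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
```

In this example the three nodes end up on the line y = x, starting at the origin, so both coordinates of every node take the same small value.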

The  driver  program  also  generates  topological  parameters 
while  building  the  network.  These  parameters  are  as  follows: 

1. An adjacency matrix, A, whose elements, $a_{ij}$, are constrained
by $a_{i1} = i$, $1 \le i \le N$ and $a_{ij} = k$, $a_{ij} \ne i$, $1 \le i \le N$, $2 \le j \le n_i$,
and $1 \le k \le N$, where $n_i$, as described in section 3.1, represents the
number of directed links emanating from vertex i. For a given i, a
directed edge, $e_{ij}$, is indicated between vertex i and vertex $a_{ij}$,
$2 \le j \le n_i$. However, since all links are full-duplex, a directed edge
between i and $a_{ij}$ implies there exists a corresponding directed edge
between vertex j and vertex $a_{ji}$, $2 \le i \le n_j$.

2. A distance matrix, D, whose elements, $d_{ij}$, represent the
distance between node i and node $a_{ij}$, $2 \le j \le n_i$. The distance is
determined by applying Pythagoras' theorem to the nodes for which
connectivity is indicated.

3. A capacity matrix, C, whose elements, $c_{ij}$, represent the
capacity of the link connecting i to $a_{ij}$. Although all interconnecting
circuits are set to 9.6 kilobits per second for this research, minor
reprogramming would provide links designated for different capacities,
each unique if necessary.

4. A traffic matrix, R, whose elements, $r_{ij}$, specify the expected
traffic load from node i to node $a_{ij}$. These values are used to derive
the mean interarrival times (see section 3.1).

Furthermore, the driver routine assigns link numbers and builds
source and destination vectors that are cross-referenced with the link
numbers. These vectors are valuable in reducing the amount of
execution time necessary for acknowledging receipt of a packet. Control is
turned over to GASP when these functions are complete.
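The adjacency and distance matrices described in the list above can be sketched in Python as row lists. The function name `build_matrices`, the dictionary representation, and the three-node example are assumptions for illustration; the driver program itself is written in FORTRAN.

```python
import math


def build_matrices(coordinates, links):
    """Build adjacency and distance rows from node coordinates and a list of
    full-duplex links given as (i, k) pairs.

    Each adjacency row begins with the node itself, followed by its adjacent
    nodes; the distance row holds the matching straight-line distances."""
    adjacency = {i: [i] for i in coordinates}
    for i, k in links:
        adjacency[i].append(k)      # full-duplex: record both directions
        adjacency[k].append(i)
    distance = {}
    for i, row in adjacency.items():
        xi, yi = coordinates[i]
        distance[i] = [0.0] + [
            math.hypot(coordinates[k][0] - xi, coordinates[k][1] - yi)
            for k in row[1:]
        ]
    return adjacency, distance


# Hypothetical three-node network.
coordinates = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (3.0, 0.0)}
adjacency, distance = build_matrices(coordinates, [(1, 2), (2, 3)])
```

Storing only the adjacent nodes in each row, rather than a full N by N matrix, keeps the tables small for sparsely connected networks.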

GASP begins the simulation process by initializing internal
variables and then calling a user routine INTLC to initialize user
variables (e.g., the first message and the first delay update events
for all nodes). INTLC is called at the start of each simulation and
therefore provides for re-initializing parameters when multiple
simulations are desired (68). After resetting user variables, control is
passed back to GASP for the event simulation.

Event routines are required for message generation, nodal
processing and routing, and line transmission. The following statistics
are output at the conclusion of a simulation:

1.  Rated  system  load. 

2.  Observed  system  load. 

3.  Last  generated  message. 

4.  Current  simulation  time. 

5.  Number  of  buffers  per  node. 

6.  Delay  vector  update  interval. 

7.  Distance  bias  term. 


8.  Rejections  due  to  busy  link. 

9.  Rejections  due  to  full  buffers. 

10. Rejections due to busy CPU.

GASP also computes the mean, standard deviation, minimum and maximum
values, and number of observations for the following GASP input
initialization parameters:

1.  Average  message  generation  times. 

2.  Link  busy  time. 

3.  Average  hops  per  message. 

4.  Average  message  delay. 

5.  Average  queue  length. 

The observed system performance is compared to theoretical results
in Figure 13. The simulation results were obtained using a 36 node
network at indicated load factors. The theoretical curve, derived using
equation 3.12, requires that message traces and individual nodal mean
arrival rates be maintained during the simulation run. This results in
exceptionally large storage and simulator execution requirements. As a
consequence, an alternative method was utilized. Kleinrock's results
(31) for a 19 node network are translated to a 36 node network with the
equation Y' = 5.45Y where Y is the mean message delay for the 19 node
network and Y' is the corresponding translated result for the 36 node
network. The simulation results for RE = 0.6 are used as a pivot point.
A comparison of the two curves indicates that the results of the
simulation have a close correlation with the translated theoretical results.
This comparison is valid only if a linear transformation of average
message delay is assumed between the two network sizes.

Figure 13. Simulation performance (legend: simulated results; translated theoretical curve)

4.3  Description  of  Experimental  Models 

Each of the curves in Figure 9 represents the execution times
required for delivering the indicated number of messages. An
equation representing each of the curves can be obtained by using
a linear combination of the well known Lagrange polynomials:

$$ P_j(x) = \prod_{\substack{i=0 \\ i \ne j}}^{n} \frac{x - x_i}{x_j - x_i}, \quad 0 \le j \le n. \tag{4.3} $$

The linear combination is defined by

$$ P_n(x) = \sum_{j=0}^{n} f(x_j) P_j(x) \tag{4.4} $$

which subsequently produces

$$ P_2(x) = 0.37x^2 - 245.03x + 144.82 \tag{4.5} $$

using the three points given for the 50 messages delivered per node.
This equation can be used to project approximate execution times for
different size networks as long as all other parameters remain fixed.
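A short Python sketch of Lagrange interpolation as defined by equations 4.3 and 4.4 is given below. The function names are hypothetical, and the three sample points are invented for illustration (they lie on a known quadratic so the result can be checked); they are not the measured execution times plotted in Figure 9.

```python
def lagrange_basis(xs, j, x):
    """Value at x of the j-th Lagrange polynomial for the nodes xs (eq. 4.3)."""
    value = 1.0
    for i, x_i in enumerate(xs):
        if i != j:
            value *= (x - x_i) / (xs[j] - x_i)
    return value


def lagrange_interpolate(xs, fs, x):
    """Linear combination of the basis polynomials (eq. 4.4)."""
    return sum(f_j * lagrange_basis(xs, j, x) for j, f_j in enumerate(fs))


# Hypothetical samples of q(x) = 2x^2 - 3x + 1 at three network sizes.
sizes = [16, 36, 64]
times = [465.0, 2485.0, 8001.0]
projected = lagrange_interpolate(sizes, times, 100)
```

Because three points determine a quadratic exactly, the projection at 100 reproduces q(100); for measured data the same caution applies as in the text, since extrapolating far beyond the sampled sizes amplifies any error in the three points.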

The prediction for a simulation of a network of 1024 nodes at load
factors 0.2 through 1.0 with an average of 50 messages delivered per
node was 28.4 hours. The results in Figure 9 suggest that execution
times increase at an explosive rate as the number of nodes per
network increases, so such an effort is economically impractical.
Therefore, a systematic six-step procedure was developed using three
smaller networks of different sizes as an inductive argument for
expected results with much larger networks. The final step in the
procedure consists of a sequence of simulations of a 256 node
network, varying load factors between 0.2 and 1.0 in increments of 0.2.

Using an ALGOL-like syntax, the entire procedure can be summarized
as follows:

STEP_1: BEGIN
          ALGORITHM = 1;
          FOR N = 4 STEP 1 UNTIL 6 DO;
            *ESTABLISH THE NETWORK SIZE*
            NS = N**2;
            BEGIN
              *EVALUATE EACH SEARCH DEPTH*
              FOR SD = 1 STEP 1 UNTIL 4 DO;
              BEGIN
                *EVALUATE NETWORK PERFORMANCE AT INTERVALS*
                *OF MESSAGES DELIVERED*
                FOR MPN = 50 STEP 50 UNTIL 200 DO;
                BEGIN
                  *EVALUATE LOAD FACTORS 0.2 THROUGH 1.0*
                  FOR RE = 0.2 STEP 0.2 UNTIL 1.0 DO;
                    simulate algorithm 1 with network size =
                    NS, search depth = SD, messages delivered
                    per node = MPN, and load factor = RE;
                END
              END
            END
        END;

STEP_2: Analyze data for optimum search depth.

STEP_3: BEGIN
          *SET NETWORK SIZE*
          NS = 36;
          *ESTABLISH SIMULATION CUTOFF PARAMETER OF 50*
          *MESSAGES PER NODE*
          MPS = 50;
          *SELECT THE BEST SEARCH DEPTH*
          SD = optimum;
          *EVALUATE EACH ALGORITHM*
          FOR ALGORITHM = 2 STEP 1 UNTIL 3 DO;
          BEGIN
            *EVALUATE LOAD FACTORS 0.6 THROUGH 1.0*
            FOR RE = 0.6 STEP 0.2 UNTIL 1.0 DO;
              simulate algorithm X with network size = 36,
              search depth = SD, messages delivered per
              node = 50, and load factor = RE;
          END
        END;

STEP_4: Analyze data for optimum algorithm.

STEP_5: BEGIN
          *SELECT THE BEST ALGORITHM*
          ALGORITHM = optimum;
          *ESTABLISH NETWORK SIZE*
          NS = 36;
          SD = optimum;
          BEGIN
            FOR MPN = 50 STEP 50 UNTIL 200 DO;
            BEGIN
              FOR RE = 0.2 STEP 0.2 UNTIL 1.0 DO;
                establish confidence intervals at load factor
                = RE using a 36 node network with the optimum
                search depth and algorithm delivering 50, 100,
                150, and 200 messages per node;
            END
          END
        END;

STEP_6: BEGIN
          *SELECT THE BEST ALGORITHM*
          ALGORITHM = optimum;
          NS = 256;
          *SELECT THE OPTIMUM SEARCH DEPTH*
          SD = optimum;
          *EVALUATE NETWORK PERFORMANCE AT INTERVALS OF*
          *MESSAGES DELIVERED*
          BEGIN
            FOR MPN = 25 STEP 25 UNTIL 100 DO;
            *EVALUATE LOAD FACTORS 0.2 THROUGH 1.0*
            BEGIN
              FOR RE = 0.2 STEP 0.2 UNTIL 1.0 DO;
                simulate a network of 256 nodes delivering 25,
                50, 75, and 100 messages per node using the optimum
                algorithm and search depth;
            END
          END
        END;
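
To give a sense of the size of the first step, its nested loops can be enumerated as a parameter sweep. The Python sketch below only lists the parameter combinations; the function name is hypothetical and no simulation is performed.

```python
from itertools import product


def step1_runs():
    """Enumerate the (NS, SD, MPN, RE) combinations visited by step 1."""
    network_sizes = [n ** 2 for n in range(4, 7)]             # NS = 16, 25, 36
    search_depths = range(1, 5)                               # SD = 1..4
    messages_per_node = range(50, 201, 50)                    # MPN = 50..200
    load_factors = [round(0.2 * k, 1) for k in range(1, 6)]   # RE = 0.2..1.0
    return list(product(network_sizes, search_depths,
                        messages_per_node, load_factors))


runs = step1_runs()
# Simulations per search depth on one network configuration (4 MPN x 5 RE).
per_depth = sum(1 for ns, sd, mpn, re in runs if ns == 16 and sd == 1)
```

Under this reading, step 1 alone covers 240 parameter combinations, 20 for each search depth on each of the three networks.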


Three  algorithms  are  indicated  in  the  above  procedure  (steps  1 
and  2).  Each  algorithm  is  identical  except  for  the  method  of  deriving 
the total delay along a path, v. For each algorithm, a different
weighting function, g(φ), was employed as described by equation 3.1 A.
The basic form of the three algorithms is best conveyed by the syntax
below. Phi takes on the values of 1, 2, and e for algorithms 1, 2,
and 3 respectively.

BEGIN
  *ESTABLISH A WEIGHT VARIABLE FOR FINE TUNING*
  WT = 1.0/PHI;
  *SELECT THE APPROPRIATE SEARCH DEPTH*
  LEVEL = SEARCH_DEPTH;
  *EVALUATE THE PATHS IN THE TREE*
  FOR PATH = 1 STEP 1 UNTIL TOTAL_PATHS DO;
    *RETRIEVE THE NODE TO BE EVALUATED*
    TEMP_NODE = VERTEX(LEVEL, PATH);
    *EVALUATE THE COORDINATES OF THE RETRIEVED NODE*
    A_VALUE = ABSOLUTE_OF(X_COORDINATE_OF(TEMP_NODE)
              - X_COORDINATE_OF(DESTINATION_NODE));
    B_VALUE = ABSOLUTE_OF(Y_COORDINATE_OF(TEMP_NODE)
              - Y_COORDINATE_OF(DESTINATION_NODE));
    STMT_8: WEIGHTED_DISTANCE = FACTOR * (A_VALUE + B_VALUE)
                                + PIVOT_POINT;
    BEGIN
      *TOTAL TREE DELAY IN THE DELAY TABLE*
      TOTAL_DELAY = 0.0;
      FOR LL = 1 STEP 1 UNTIL LEVEL DO;
        TOTAL_DELAY = TOTAL_DELAY + WT * DELAY_MATRIX(LL, PATH);
    END
    *ACCUMULATE THE TOTAL DELAY IN A PATH*
    DELAY_VECTOR(PATH) = BIAS * WEIGHTED_DISTANCE + TOTAL_DELAY;
END;

BEGIN
  *INITIALIZE FOR COMPARISON OF COSTS*
  MIN_COST = DELAY_VECTOR(1);
  *SELECT THE FIRST PATH FOR MINIMUM COST EVALUATION*
  PATH = 1;
  *COMPARE THE COSTS OF THE REMAINING PATHS*
  FOR PTR = 2 STEP 1 UNTIL TOTAL_PATHS DO;
    IF MIN_COST GREATER THAN DELAY_VECTOR(PTR) THEN
      MIN_COST = DELAY_VECTOR(PTR);
      PATH = PTR;
    ELSE;
  LINK = LINK_VECTOR(PATH);
  NEXT_NODE = ADJACENCY_MATRIX(CURRENT_NODE, LINK);
END;
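
The listing above can be restated as a short Python sketch. It follows the same arithmetic (rectilinear distance to the destination, statement 8, and the level-weighted delay sum), but the function name, argument layout, and sample values are illustrative assumptions rather than the simulator's actual code.

```python
import math


def select_path(candidates, destination, delays, phi, factor, pivot_point, bias):
    """Return (index of the minimum-cost path, list of all path costs).

    candidates  -- (x, y) of the node reached at the search depth on each path
    destination -- (x, y) of the destination node
    delays      -- delays[path] holds the per-level delay entries for that path
    """
    wt = 1.0 / phi                      # weight variable for fine tuning
    costs = []
    for (x, y), path_delays in zip(candidates, delays):
        distance = abs(x - destination[0]) + abs(y - destination[1])
        weighted_distance = factor * distance + pivot_point   # statement 8
        total_delay = sum(wt * d for d in path_delays)
        costs.append(bias * weighted_distance + total_delay)
    # min() keeps the first path on ties, like the strict comparison above.
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs


# Hypothetical two-path example evaluated with phi = e (algorithm 3).
best, costs = select_path(
    candidates=[(0.0, 0.0), (3.0, 4.0)],
    destination=(3.0, 5.0),
    delays=[[1.0, 1.0], [2.0, 2.0]],
    phi=math.e, factor=1.0, pivot_point=0.0, bias=1.0,
)
```

Passing phi as 1, 2, or e reproduces the three weightings; everything else in the evaluation is shared, which matches the statement that the algorithms differ only in how path delay is weighted.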


The variable WEIGHTED_DISTANCE, in statement 8, requires further
explanation. Clearly, increased distances between TEMP_NODE and
DESTINATION_NODE will result in a greater value for A_VALUE + B_VALUE.
If WEIGHTED_DISTANCE were set equal to this sum, then for greater
distances the algorithm would lose its dynamic routing flexibility
because of the totally dominant distance factor. Not only does routing
then become fixed, but some nodes in poorly developed distributed
networks quickly become saturated, resulting in messages becoming
trapped by loops. This looping phenomenon is given extensive analysis
by Neblock (65) in his research on loop-free routing algorithms.

Consider  the  following  example  to  demonstrate  the  use  of  the 
weighted  distance  function.  Assume  a search  depth  of  1,  and  that  a 
message  generated  by  node  1 (Figure  14)  is  destined  for  node  7.  If 
the  dominant  routing  factor  is  always  distance,  then  the  message  would 
quickly assume a routing pattern of 1-3-6-2-3-6-2-3-6-... where a
message  may  not  be  returned  to  the  node  from  which  it  arrived.  This  is 
because  the  distances  between  nodes  4 and  5 and  node  7 are  greater  than 
those  between  nodes  3 and  7.  Use  of  a function  such  as  that  contained  in 
statement  8 places  a limit  on  this  distance  value,  and  therefore  on  the 
impact  of  the  distance  element. 

The variable PIVOT_POINT must be set to a threshold value at which
point distance no longer has dominant control. FACTOR is determined
by the amount of emphasis each additional distance unit should have.
The specific values used for PIVOT_POINT and FACTOR were determined
during the tuning phase of the simulator.

Step  3 provides  for  the  analysis  of  data  collected  during  the 
simulation  process,  and  will  be  discussed  in  Chapter  V.  The  purpose 
of  step  4 is  to  add  credibility  to  the  results  obtained  from  the  network 
simulation.  The  method  involves  the  use  of  estimators  for  the  average 
queue  length,  average  delay,  and  average  system  utilization.  Confidence 
intervals are established around these estimators to reflect a percentage
of certainty that simulated measurements can be expected to approach
the actual measurements.

Considerable  emphasis  was  placed  on  developing  a program  which 
generates  coordinates  for  nodes  of  a well  connected  network.  The 
program  had  to  provide  coordinates  with  some  element  of  randomness, 
establish  connectivity,  and  conform  to  the  criteria  of  a distributed 
network.  Since  the  purpose  of  this  research  was  to  investigate 
algorithms  for  large  networks  which  dynamically  adjust  routes  for 
varying  load  conditions,  care  was  taken  to  avoid  the  use  of  any  network, 
such  as  the  one  in  Figure  14,  whose  design  enhances  looping.  "Detect 
and  suppress"  mechanisms  (32)  were  installed  to  minimize  looping  which 
occurs at network saturation. Should a network possess a node that is
not well connected, such as node 6 in Figure 14, one of two corrective
measures is possible:

1. Redesign the network for better connectivity, providing for
connectivity between nodes 6 and 7 in the example, or

2. Provide for unique software at nodes which have a high
propensity for loops. In the network of Figure 14, nodes 2 and 3 must
ensure that traffic going to nodes 7, 8, or 9 is forced into nodes 4
and 5 respectively.

Standard  software  is  a critical  factor  in  minimizing  development 
costs  in  large  networks.  Therefore,  the  simulator  assumes  good 
connectivity  as  a desirable  attribute  when  generating  the  networks  to 
be  simulated.  The  generated  networks  used  for  testing  the  dynamic 
nature of the proposed algorithms are shown in Figures 15, 16, and 17.

CHAPTER V

ADAPTIVE  ROUTING  FOR  LARGE  DISTRIBUTED  NETWORKS 

5.  Introduction 

The  most  commonly  used  adaptive  routing  policies  for  distributed 
networks  base  decisions  about  routing  on  delay  information  stored  in 
tables  at  each  node  of  the  network.  These  tables  contain  the  most 
recent  information  available  for  identifying  the  minimum  cost  path 
to  each  destination  accessible  from  a given  node.  They  are  regularly 
updated  using  a combination  of  delays  internal  to  the  node  and  delay 
information  provided  by  neighboring  nodes.  For  large  scale  networks  of 
250  nodes  or  more,  the  routing  information  stored  at  each  node  becomes 
excessively  costly  in  terms  of  nodal  storage  required  to  maintain 
routing  tables  and  execution  time  needed  for  updates. 

A second  problem,  of  no  less  significance,  has  to  do  with  schemes 
for  preventing  looping.  The  formation  of  loops  within  a network  causes 
a looping  message  to  traverse  more  paths  than  necessary  to  reach  its 
destination.  Once  loop  formations  have  occurred,  McQuillan  (59)  has 
shown  that  they  are  self-perpetuating  because  of  the  increasing  and 
fluctuating  loads  placed  on  a network.  The  classical  approach  to 
inhibit  looping  is  to  include  in  the  header  of  each  packet  a bit 
oriented  field,  one  bit  for  every  node  in  the  network.  Each  bit, 
initially  set  to  zero  (off),  is  turned  on  whenever  the  packet  passes 
through  the  corresponding  node.  This  field  is  interrogated  by  each 
node needing to transmit a message. Ports leading to nodes with an
"on" condition in the packet to be transmitted are eliminated from
consideration. A packet may become trapped when it enters a node
whose adjacent nodes have all been previously traversed. The trapping
phenomenon is discussed later in this chapter.
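
The classical bit-oriented trace field can be sketched as an integer bitmask, one bit per node. The helper names and the small adjacency lists below are hypothetical and serve only to show how ports are eliminated and how a trap arises.

```python
def visit(trace, node):
    """Turn on the trace bit for `node` as the packet passes through it."""
    return trace | (1 << node)


def eligible_ports(trace, neighbors):
    """Return the neighbors whose trace bit is still off."""
    return [n for n in neighbors if not (trace >> n) & 1]


def is_trapped(trace, neighbors):
    """A packet is trapped when every adjacent node has already been visited."""
    return not eligible_ports(trace, neighbors)


# Hypothetical packet that has already passed through nodes 0 and 1.
trace = visit(visit(0, 0), 1)
open_ports = eligible_ports(trace, [0, 1, 2])   # only node 2 remains
trapped = is_trapped(trace, [0, 1])             # no unvisited neighbor is left
```

The field grows by one bit per node in the network, which is the packet overhead that the partitioned and hierarchical schemes discussed next try to reduce.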

Partitioned routing schemes proposed by Neblock (65) and hierarchical
routing schemes proposed by Kamoun (42) both provide for loop-free
routing with a high propensity to avoid traps. Neblock's partitioning
scheme provides a significant savings in bandwidth by decreasing the
number of bits for tracing the path of a packet. For example, a 64 node
network requires 64 bits of trace overhead in global node mapping. If,
however, the network is divided into eight regions of eight nodes each
as proposed by Neblock, the bit-oriented trace field is reduced to
log2 8 + 8 = 11 bits. The method is simple and imposes little packet
overhead even for large networks. It does, however, possess the
following disadvantages, some of which Neblock suggested for further
research.

1.  Partitioning  of  an  arbitrary  network  in  such  a manner  that  the 
algorithm  works  equally  well  regardless  of  direction  of  flow  is  a 
complex  and  yet  unsolved  problem. 

2.  Minor  changes  in  network  topology  have  a rather  dramatic  effect 
on  nodal  software.  For  example,  an  increase  of  one  node  may  result  in 
the  loss  of  one  or  two  text  bits  to  the  trace  field  depending  upon  the 
existing  configuration.  Corresponding  changes  in  nodal  software  would 
be  necessary. 

3.  Delay  tables  residing  at  each  node  remain  relatively  large  for 
large  networks  and  fluctuate  in  size  with  changes  in  topology. 
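
The trace-field arithmetic in the 64 node example can be checked with a short Python sketch. It assumes, as one reading of the log2 8 + 8 expression, that the partitioned field holds a binary region number plus one bit per node within a region; the function names are hypothetical.

```python
def global_trace_bits(nodes):
    """One trace bit per node in the whole network."""
    return nodes


def partitioned_trace_bits(regions, nodes_per_region):
    """Bits to number a region in binary plus one bit per node in a region."""
    region_bits = (regions - 1).bit_length()   # ceil(log2(regions))
    return region_bits + nodes_per_region


flat = global_trace_bits(64)                # 64 bits
partitioned = partitioned_trace_bits(8, 8)  # 3 + 8 = 11 bits
```

The second disadvantage listed above follows directly from this formula: adding a ninth node to a region, or a ninth region, changes the field width and therefore the nodal software that interprets it.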

Kamoun uses essentially the same concept used by Neblock with two
exceptions. First, partitions (clusters) are allowed to reside within
superclusters which may reside in larger clusters, etc., until the
network may be completely enclosed in a global cluster. Second, Kamoun
emphasizes the analytical approach to modeling network parameters,
which may be indicative of the complexity in simulating the scheme.

The  hierarchical  routing  scheme  provides  a surprising  amount  of 
savings  in  very  large  networks  for  storing  delay  tables  at  each  node. 
This  is  because  there  are  no  restrictions  on  the  number  of  levels  in 
the hierarchy. Each additional hierarchy reduces storage requirements
for subordinate clusters. Adding hierarchies is effective up to some
optimum level at which point storage requirements begin to increase
due to the number of hierarchies. An example for the extreme case is
where each node, excluding the leaf, has one and only one subordinate
node. The problem of determining the optimum level is addressed in
Kamoun's research.

The  added  advantages  of  the  hierarchical  scheme  are  not  without 
impact elsewhere. Problems encountered because of the first two
disadvantages noted in the partitioning scheme are compounded in
clustering. For the first disadvantage, Kamoun also mentions that a
systematic means for clustering (partitioning) a network requires
further research. For the second, if each cluster level has reached
a maximum threshold for a given network, the addition of even one node
could add a bit in the packet trace field for each level of the
hierarchy. This extreme example is not a likely occurrence, but does
provide a good example of the complexity in accommodating minor
topological adjustments. Regardless, the research efforts of Neblock
and Kamoun are bold and innovative steps toward developing non-looping
routing algorithms that will accommodate larger networks.


A brief  review  of  the  research  objectives  is  appropriate  prior  to 
discussing  the  objectives  and  results  of  the  simulation.  Chapter  I 
describes  the  objectives  of  this  research  in  broad  general  terms;  that 
is,  to  develop  a distributed  routing  algorithm  which  is  independent  of 
network size. Specifically, the algorithms investigated are practically
impervious to changes in network topology with respect to packet
overhead and delay table storage.

Packet overhead as required to implement the coordinate addressing
technique (CAT) described in Chapter III is minimized. For example,
24 bits will accommodate over 16 million nodes with a one mile
separation where x and y are bounded by 0 ≤ x, y ≤ 4096 in an (x, y)
rectangular plane. Eighteen bits are required to address 262,144 nodes
with 10 mile separation, x and y bounded by 0 ≤ x, y ≤ 5120. As a final
and more practical example, 12 bits will accommodate 4096 nodes with
100 mile separation, x and y bounded by 0 ≤ x, y ≤ 6400. In general,
increasing the rectangular coordinate field in the packet header will
allow a decrease in distance of separation between nodes or an increase
in the total area of the network. As an additional benefit, the
non-standard packet destination field can also be deleted since the
coordinate field provides sufficient routing intelligence.
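
The three examples above are consistent with splitting the coordinate field evenly between x and y. The following Python sketch makes that arithmetic explicit; the even split is an inference from the quoted figures, and the function name is hypothetical.

```python
def coordinate_capacity(header_bits, separation_miles):
    """Return (addressable nodes, coordinate bound in miles) for a CAT field.

    Assumes the field is split evenly between the x and y coordinates.
    """
    bits_per_axis = header_bits // 2
    positions = 2 ** bits_per_axis              # grid positions per axis
    return positions ** 2, positions * separation_miles


examples = [
    coordinate_capacity(24, 1),     # one mile separation
    coordinate_capacity(18, 10),    # 10 mile separation
    coordinate_capacity(12, 100),   # 100 mile separation
]
```

Each additional pair of header bits quadruples the number of addressable grid positions, which can be spent either on a finer node separation or on a larger network area.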

Determining nodal storage requirements for delay tables is equally
simple. Establish a maximum number of adjacent nodes, say n, during
design (it need not be conservative). The number of delay table entries
is determined by m^n where m is the optimum search depth referred to
in section 4.3. It is demonstrated later in this chapter that a
search depth of 2 provides consistently good results for the three
network sizes investigated. Assuming a maximum of five adjacent nodes
to each node, the delay table will require 2^5 = 32 entries. This
feature, combined with the use of rectangular coordinates to provide
routing direction, is the essence of the CAT which makes it adaptable
to any size network.

A disadvantage of CAT, however, is the inability to prevent
looping. CAT does not have a packet trace field, and thus cannot have
a historical trace of nodes visited by the packet. Therefore, loops
are not detectable at the exact moment of formation. However,
there are methods which tend to diminish the continued occurrence
of loops. The following procedure is an example of such techniques.

1.  Establish  a threshold,  t,  on  the  number  of  hops  allowed  before 
a packet  is  identified  as  looping.  One  obvious  value  for  t is  the 
number  of  nodes  in  the  network.  Another,  used  in  this  research,  is 
twice  the  minimum  path  length  between  the  two  farthest  nodes. 

2. Maintain a hop count, c, in the packet header. If c is
greater than t, determine MOD(c, t).

3. Define a secondary path limit, p. When evaluating for minimum
cost traversal, discard the first choice, using the second choice
path for as long as 1 ≤ MOD(c, t) ≤ p.
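
The three-step procedure above can be sketched as a single decision function. This is a minimal Python illustration of the stated rule; the function name and the list of ranked paths are hypothetical, and falling back to the first choice when only one path exists is an added assumption.

```python
def choose_path(ranked_paths, hop_count, threshold, secondary_limit):
    """Pick a path from `ranked_paths` (lowest cost first).

    While the packet is past the hop threshold t and
    1 <= MOD(c, t) <= p, the first choice is discarded.
    """
    looping = hop_count > threshold
    if (looping and len(ranked_paths) > 1
            and 1 <= hop_count % threshold <= secondary_limit):
        return ranked_paths[1]
    return ranked_paths[0]


# Hypothetical settings: t = 10 hops, p = 2 (equal to a search depth of 2).
choices = [choose_path(["first", "second"], c, 10, 2) for c in (5, 11, 12, 13, 20, 21)]
```

Because the rule depends only on the hop count carried in the header, no per-node history is stored, and a packet is never forbidden from revisiting a node.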

The secondary path limit in the above procedure was tested at a
value equal to the search depth in the simulator, and excellent results
were obtained. Such "detect and suppress" anti-looping procedures
cannot cause trapping because no restrictions are placed on the number
of times a packet may visit a node. The trapping phenomenon, described
later in this chapter, is fatal to the efficient operation of a network.

The simulator was designed based on the following objectives:

1.  To  determine  an  optimum  search  depth  to  be  used  for  deriving 
minimum  cost  in  node  traversal.  Evaluate  this  search  depth  for  various 
network  sizes. 

2. To determine the best of several algorithms which take advantage
of information derived from meeting objective 1. The three
algorithms investigated are referred to as CATn, 1 ≤ n ≤ 3, where
increasing values of n represent bias terms, g(φ), equal to 1, 1/search
depth level, and 1/e as defined in Chapter III.

3.  To  establish  a level  of  confidence  in  data  obtained  from  the 
simulation. 

4. To determine how the best of the CAT algorithms compares to
several common, previously defined routing policies.

5.  To  demonstrate  the  operation  of  a large  scale  network  using 
the  best  evaluated  CATs. 

Remaining  sections  of  this  chapter  elaborate  on  results  of  using 
these  objectives  in  the  simulation. 

5.1  Determining  the  Optimum  Search  Depth 

An  upper  bound  was  necessary  on  the  amount  of  storage  considered 
practical  for  maintaining  delay  tables.  For  that  reason,  the  maximum 
search  depth  investigated  was  limited  to  four,  requiring  a maximum  of 
1024  delay  table  entries  where  each  node  is  allowed  up  to  five  adjacent 
nodes. The simplest of the three CATs evaluated, CAT1, was used in
determining the optimum search depth. Results of the simulations to
determine the optimum search depth (SD) are given in Figures 18 through
29 (supporting data are given in Appendix B). Interconnecting
lines in these figures are only intended to convey estimates of
continuous functions for which adequate mathematical formulae are
lacking. Also, the tuning process for each size network was completely
independent, making comparisons between performance data from different
size networks invalid.

Several  independent  simulations  were  made  at  each  of  the  load 
factors  (RE)  0.2,  0.4,  0.6,  0.8,  and  1.0  (see  tables  in  Appendix  B). 
These  independent  runs  involved  increasing  the  levels  of  the  average 
number  of  messages  delivered  per  node.  Twenty  simulations  were  made 
at  each  of  the  four  search  depths  per  network  configuration.  The 
results  at  each  load  factor  were  then  averaged  with  the  appropriate 
weight applied to each level of delivered messages. For example, the
average packet hop entries, j, in Table B1-1 were obtained by

    j = \frac{\sum_{m=1}^{4} 50 m n T_{i,m}}{\sum_{m=1}^{4} 50 m n}    (5.1)

where n is the number of nodes in the network and T is a matrix
representation of Table B1-1. The lower and upper bounds, L_{.95} and
U_{.95}, on the 95 percent confidence intervals were derived using the
cumulative t-distribution:

    L_{.95} = j - t_{0.95}(c-1) s / \sqrt{c}    (5.2)

    U_{.95} = j + t_{0.95}(c-1) s / \sqrt{c}    (5.3)

where c is the number of columns in Table B1-1, c - 1 is the degrees of
freedom, and s is the standard deviation of the entries T_{i,m}, 1 ≤ m ≤ 4.

Confidence  intervals  for  queue  lengths  were  derived  using  as  a weight 
the number of queue length observations taken during the corresponding
simulations. The respective number of observations is placed adjacent
to each average queue length in the queue length tables.
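
The weighted averaging and the confidence bounds described above can be sketched in Python as follows. The sample row, the use of the sample standard deviation, and the division by the square root of the number of entries are assumptions made for illustration; the critical t value is supplied by the caller rather than computed.

```python
import math
import statistics


def weighted_mean(values, weights):
    """Weighted average, e.g. hop counts weighted by messages delivered."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def confidence_bounds(values, center, t_critical):
    """Return (lower, upper) bounds: center -/+ t * s / sqrt(c)."""
    s = statistics.stdev(values)                  # sample standard deviation
    half_width = t_critical * s / math.sqrt(len(values))
    return center - half_width, center + half_width


# Hypothetical row of average hop counts at 50, 100, 150, 200 messages per node.
nodes = 16
hops = [3.1, 3.3, 3.2, 3.4]
weights = [50 * m * nodes for m in range(1, 5)]   # 50mn, m = 1..4
j = weighted_mean(hops, weights)
# 2.353 is the tabulated 0.95 quantile of the t-distribution with 3 d.f.
lower, upper = confidence_bounds(hops, j, 2.353)
```

Weighting by 50mn gives the longer runs, which deliver more messages, proportionally more influence on the reported average.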

A logical  departure  point  for  analyzing  search  depths  is  to 
evaluate  the  average  message  hops  in  traversing  a network.  Figures  18 
through  20  provide  representations  of  average  message  hops  for  each  of 
the three evaluated network configurations. Tables B1-1 through B1-3
contain the data from which the graphs were derived. Corresponding
search depths are indicated for each curve. Numbers directly above or
below each data point reflect the maximum hops experienced by any one
packet at the indicated effective data rate (EDR). Where more than one
number is shown at a given EDR, the numbers correspond to increasingly
higher search depths. For example, the set 17, 6, 61 describes the
maximum hops at search depths 1, 2, and 3 for an EDR of 0.6. The
average number of hops per packet, j, is indicated along the vertical
axis and, as with all graphs, the horizontal axis represents the
effective data rate (load factor).

Search  depths  (SD)  1 and  2 both  performed  well  in  the  smaller,  16 
node  network.  SD  2 was  beginning  to  experience  some  fluctuations  at 
the  extreme  high  EDR  as  indicated  by  a maximum  of  12  hops  for  at  least 
one  packet.  In  contrast,  SD  4 maintained  a relatively  high  average  hop 
count  through  all  EDR  levels  while  SD  3 produced  fairly  irrational 
results.  Both  SD  3 and  SD  4 indicate  strong  tendencies  towards  looping 
and  subsequent  network  saturation  at  EDR  = 1.0. 

The  peculiar  fluctuations  of  SD  3 and  the  minor  variation  in  SD  4 
at EDR = 0.6 are contrary to expectations. They are attributed to the
following factors:

1. For a 16 node network, SDs 3 and 4 approach the maximum length
of a minimum path across the network. Since destinations are generated
randomly, only approximately one half of the messages generated for
these SDs are ever evaluated for traversing more than two levels.
Therefore, relatively small variations in average hop count, delay,
etc., are greatly amplified when compared to SDs 1 and 2.

2.  The  relatively  short  queue  length  of  SD  3 at  EDR  = 0.2  allows 
messages  to  be  communicated  through  a node  at  a much  faster  pace.  These 
rapidly  changing  conditions  are  believed  to  be  out  of  phase  with  timing 
of  delay  table  updates.  As  the  load  increases,  maximum  hops  decrease 
with  a corresponding  decrease  in  average  hop  count.  Note  that  although 
queue  length  remains  approximately  constant  between  EDR  = 0.4  and  0.6, 
average  delay  actually  decreases.  A properly  synchronized  delay  table 
update  interval  is  important  to  the  efficient  use  of  network  bandwidth. 

In  the  25  and  36  node  networks,  SDs  2 and  3 produced  similar 
average  hop  counts.  SD  4 approaches  the  behavior  of  SD  1 at  the  higher 
EDR  in  the  25  node  network.  In  the  36  node  network  there  is  a tendency 
toward  looping  by  all  four  SDs  at  the  higher  EDR.  SDs  3 and  4 both 
appear  to  experience  looping  at  lesser  EDRs  than  does  SD  2.  Minor 
variations  are  attributed  to  the  randomness  in  which  destination 
addresses  are  generated. 

Clearly, SD 1 performed poorly for the 25 and 36 node networks.
The minimum number of hops for the maximum paths across the two networks
is 8 and 10 respectively. Maximum hop counts as large as those for SD
1 produce delays an order of magnitude greater than do SDs 2, 3, and 4.
It is very apparent that the strong propensity for looping by SD 1
produces the overall higher average hop count.

The correlation between average hop count, average queue length,
and average delay is, for all practical purposes, exact. It should be
clear that greater hop counts will produce larger queue lengths,
resulting in more delay (Figures 21 through 26).

One  of  the  more  common  measures  of  system  utilization  is  the 
utilization  factor  as  defined  by  the  ratio  of  the  rate  at  which  "work" 
enters  the  system  to  the  maximum  rate  (capacity)  at  which  the  system 
can  perform  the  work  (49).  The  system  is  said  to  be  in  a state  of 
saturation  whenever  the  arriving  work  load  exceeds  the  rate  at  which 
the  work  is  performed. 

The mean utilization factor, ρ, for the three evaluated networks
is illustrated in Figures 27 through 29. Data points in these
graphs were obtained by dividing the average service time by the
average arrival time. Each data point represents 8,000, 12,500, and
18,000 observations for the 16, 25, and 36 node networks respectively.
A network is saturated or approaching saturation whenever ρ = 1. The
utilization  factor  is  one  of  the  better  means  of  combining  measurements 
used  to  describe  a network's  operating  state. 
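
As a small worked illustration of the utilization factor, the ratio can be computed directly from the two averages. The function names and sample times below are hypothetical, and "arrival time" is read here as the mean time between arrivals.

```python
def utilization(avg_service_time, avg_interarrival_time):
    """Utilization factor: average service time over average arrival time."""
    return avg_service_time / avg_interarrival_time


def is_saturated(rho):
    """Work arrives at least as fast as it can be served."""
    return rho >= 1.0


# Hypothetical node: 2 time units of service, one arrival every 4 time units.
rho = utilization(2.0, 4.0)
```

A value of 0.5 here means the node is busy half the time; as the interarrival time shrinks toward the service time, the ratio approaches 1 and the node approaches saturation.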

The  relatively  poor  performance  of  SDs  3 and  4 on  the  16  node  network 
is  now  re-examined  using  network  utilization  as  the  measurement  parameter. 
The  poor  performance  is  believed  to  be  a result  of  out-of-phase  delay 
at  levels  greater  than  two.  The  period  between  the  time  of  evaluation 
and  the  time  that  a packet  has  traversed  two  more  levels  is  sufficient 
to  allow  the  previously  evaluated  minimum  path  to  become  loaded.  As  a 
result,  the  network  appears  as  a surface  possessing  local  minima  which 
attract  units  of  work.  The  greater  the  search  depth,  the  greater  the 
attraction so that additional minima are created as a result of the
strong attraction. These rapidly fluctuating conditions
have a nullifying effect on the weight applied to the minimum delay
evaluation by the previously described heuristic function. The opposing
influence of evaluating levels 3 and 4 explains the looping tendency
at higher EDRs in larger networks.

The  reason  for  the  poor  performance  of  SD  1 on  the  larger  networks 
is  opposite  to  that  for  SDs  3 and  4.  Evaluating  only  one  level  lacks 
sufficient  intelligence  about  conditions  further  down  the  path.  As  a 
result,  many  packets  encounter  heavy  loads  at  the  second  level  which 
cannot  be  anticipated  using  SD  1. 

In  contrast,  if  a network  is  sufficiently  small  that  a greater 
number  of  packets  are  generated  to  nodes  within  two  links  of  the 
originator,  SD  1 will  indicate  a better  performance.  Specifically,  if 
a packet  is  generated  to  an  adjacent  node,  the  route  is  immediately 
fixed. If a packet destined for a node two links away traverses the
first  link  as  a minimum  cost  path,  heavy  loads  down  the  second  link  are 
again  overridden  because  of  the  adjacency  of  the  packet  to  the 
destination.  A 16  node  network  provides  such  an  environment  for  SD  1 
since  more  than  50  percent  of  the  messages  generated  will  be  within  a 
two  link  distance.  The  utilization  performance  on  the  16  node  network 
is  only  slightly  worse  than  the  performance  of  the  best  search  depth. 
The monotonically increasing curvature of the function connecting its
data points signifies little irregularity as increasing loads are
encountered. As network size increases, SD 2 excels in performance.
SD 2 recovers from peak loading, EDR > 1.0, at a much faster pace than
the other search depths. It therefore was selected as the optimum SD
and used as a measure to compare other versions of the CAT.

5.2 Evaluating Other Coordinate Address Techniques (CATs)

The  evaluation  process  to  determine  the  best  performing  SD  produced 
counter-intuitive  results.  Delays  evaluated  by  various  SDs  are  little 
more  than  weighting  functions  for  determining  the  optimum  path.  Further 
adjustments  to  SD  2 could  possibly  produce  an  even  better  performance, 
perhaps SD 1.5 or some other fractional search depth. CATn, as presented
in section 5, was defined with this consideration.

Figures 30 through 32 provide the results of simulating CATn on
the 36 node network. For economic reasons and because greater
variations tend to occur at higher load factors, the simulation was
limited to the top three EDRs where greater spreads occur. CAT1 data
were available from previous simulations to evaluate the optimum search
depth.

The  inverse  exponential  weighting  factor  produced  the  best  results 
among  the  three  evaluated  CATs.  CAT2,  which  used  an  inverse  SD  level 
weight,  produced  counter-productive  results  which  were  even  less 
efficient  than  SD  1.  The  correlation  of  flows  in  a network  prevents 
the  use  of  analytical  analysis  for  rationalizing  variations  in  each  CAT. 

The  improved  results  obtained  by  using  CAT3  are  attributed  to  some 
relationship  between  the  inverse  exponential  weight  and  the  Poisson 
arrival  distribution  (which  produces  interarrival  times  exponentially 
distributed). The poor performance of CAT2 is attributed to a weighting
function, 1/n, which is diametrically opposed to the cumulative delay of
searching  n levels.  Step  2 in  section  5 has  been  completed. 
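
The contrast between the CAT2 and CAT3 weights can be seen numerically.
The following is a minimal sketch, assuming the inverse exponential
weight takes the form exp(-n) for search level n. The exact expression
is not restated in this section, so that form and both function names
are illustrative only.

```python
import math


def inverse_level_weight(n: int) -> float:
    """CAT2-style weight 1/n for search level n (n >= 1)."""
    return 1.0 / n


def inverse_exponential_weight(n: int) -> float:
    """Assumed CAT3-style weight exp(-n); the exact form is an assumption."""
    return math.exp(-n)


# Tabulate both weights for the search levels supported by the simulator.
for level in range(1, 5):
    print(level, inverse_level_weight(level), inverse_exponential_weight(level))
```

Under this assumption the exponential weight falls off much faster with
level than 1/n does, so deeper levels contribute far less to the
evaluation of a path.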

5.3 Establishing Confidence in the Simulator

Law  (52)  describes  a procedure  for  determining  efficient  estimators 
in simulated queuing systems. For example, the efficient estimator for
delay  is  defined  as 

d = Dc/Nc  (5.4) 

where  Dc  and  Nc  are,  respectively,  sample  means  of  the  total  delay  of 
all  customers  served  and  the  number  of  customers  served  in  a busy  cycle. 
Similar  equations  are  provided  for  queue  length,  waiting  time,  and 
others.  The  concept  of  determining  confidence  intervals  with  the  use 
of  efficient  estimators  has  become  an  accepted  practice  for  infinite  or 
even  very  large  populations. 

Estimators  may  be  obtained  from  the  simulations  by  weighting  the 
means  derived  from  each  simulation  with  the  respective  number  of 
observations.  This  procedure  was  used  to  calculate  the  data  points  for 
each  of  the  measurements  described  in  this  chapter.  An  example  using 
queue  lengths  was  discussed  in  section  5. 
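
The two calculations described above can be sketched as follows. This is
an illustration under stated assumptions, not the simulator's code: the
function names and the list-based inputs are hypothetical. The first
function forms the ratio d = Dc/Nc of equation 5.4 from per-busy-cycle
samples, and the second weights the mean of each simulation run by its
number of observations.

```python
def ratio_estimator(total_delays, customers_served):
    """Estimate delay as d = Dc/Nc, where Dc and Nc are the sample means
    of total delay and customers served over the observed busy cycles."""
    if not total_delays or len(total_delays) != len(customers_served):
        raise ValueError("need one delay and one count per busy cycle")
    dc = sum(total_delays) / len(total_delays)
    nc = sum(customers_served) / len(customers_served)
    return dc / nc


def pooled_mean(run_means, run_counts):
    """Combine per-run means, weighting each by its observation count."""
    total = sum(run_counts)
    if total == 0:
        raise ValueError("no observations")
    return sum(m * n for m, n in zip(run_means, run_counts)) / total
```

Weighting by observation count keeps a short run from influencing a data
point as much as a long one.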

Steps 1 and 2 of section 5 led to the conclusion that SD 2
refined by CAT3 is the best of the evaluated algorithms. A
complete  sequence  of  simulations  was  made  using  this  combination  on 
the  36  node  network.  Means  were  then  derived  from  the  resulting 
statistics  from  which  confidence  intervals  were  calculated.  The  95 
percent  confidence  intervals  for  queue  lengths,  delay,  and  utilization 
are  plotted  in  Figures  33  through  35. 
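
The text does not state which interval formula was used. As a minimal
sketch, assuming a normal approximation (reasonable for the large
observation counts reported), a 95 percent confidence interval for a
mean could be computed as shown below. The function name and the use of
the 1.96 normal quantile are assumptions.

```python
import math
import statistics


def confidence_interval_95(samples):
    """Return (low, high) for the mean using mean +/- 1.96 * s / sqrt(n).
    Assumes enough observations for the normal approximation to hold."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width
```

Because the half width shrinks with the square root of the number of
observations, quantities with more observations yield tighter intervals.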

Solid lines at discrete values along the horizontal axes represent
actual confidence intervals calculated for each load factor. Dashed
lines are projected approximations for the intervals between discrete
EDR measurements. Over 640,000 observations were made for queue
lengths to establish a 95 percent confidence on the simulated data.
Approximately 18,000 observations were made to determine confidence
intervals for delay and utilization. The tabulated statistics are
contained in Table B1-14.

[Figure: confidence interval for utilization using CAT]

5.4  Comparative  Performance  Evaluation 

In  determining  its  relative  efficiency,  CAT3  was  compared  to  three 
other  algorithms.  Two  of  these,  referred  to  as  the  periodic  update 
algorithm  (PUA)  and  the  global  mapping  algorithm  (GMA)  have  been 
thoroughly  investigated  and  applied  to  several  operational  networks. 

The  third,  called  the  regional  mapping  algorithm  (RMA),  is  proposed  by 
Neblock  (65)  as  a solution  to  looping.  Each  of  these  algorithms 
requires  some  form  of  a delay  table  for  preserving  delay  information. 
The  following  discussion  provides  a functional  description  of  each  and 
how  each  algorithm  compares  to  the  performance  of  CAT3. 

The PUA, in its basic form, makes no attempt to alleviate the
looping phenomenon. Each node builds a delay table containing
anticipated delays to all other nodes in the network over all outgoing
links.
The  algorithm  causes  transmission  of  delay  table  updating  vectors  to 
all  nodes  after  some  prespecified  interval  of  time.  The  communications 
of  update  vectors  occur  approximately  synchronously  in  time  for  all 
nodes  (33).  The  PUA  operates  as  a fixed  routing  procedure  until  the 
delay  tables  are  altered.  Table  III  provides  a brief  description  of 
the  algorithm. 

Table III. Periodic Update Algorithm

1. If this is the source node, append the source identifier and route
   the message to its destination via the minimum delay line.

2. If this is the destination node:
   a. remove the message,
   else
   b. append the current node's address,
   c. select the minimum delay line from the delay table and queue
      the message for transmission, then
   d. go to 2.
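
The steps of Table III can be sketched as follows. This is a simplified
illustration, not the simulator's implementation: the nested dictionary
layout of the delay tables, the function names, and the `max_hops` guard
(added only so that the example terminates when a loop forms) are all
assumptions.

```python
def pua_next_hop(delay_table, destination):
    """Pick the neighbor with the smallest anticipated delay to destination.
    delay_table maps neighbor -> {destination: anticipated delay}."""
    return min(delay_table, key=lambda nbr: delay_table[nbr][destination])


def pua_route(tables, source, destination, max_hops=16):
    """Forward a message hop by hop; return (path, delivered).
    tables maps node -> that node's delay table."""
    path = [source]
    node = source
    while node != destination:
        if len(path) > max_hops:      # hypothetical guard, not in Table III
            return path, False
        node = pua_next_hop(tables[node], destination)
        path.append(node)             # step 2b: record the node visited
    return path, True


# Stale tables: nodes 1 and 2 each believe the other is the faster way to 3.
stale = {1: {2: {3: 1.0}, 3: {3: 5.0}},
         2: {1: {3: 1.0}, 3: {3: 5.0}}}
looping_path, delivered = pua_route(stale, 1, 3, max_hops=4)
```

With the stale tables above, the message bounces between nodes 1 and 2
until the guard stops it, which is the looping behavior that out-of-date
delay tables permit.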


Obviously, the PUA will experience looping when delay table updates
are no longer responsive to actual conditions. This may occur when
loads fluctuate out of phase with table updates. The looping problem
led to the GMA, which uses a trace field to maintain a historical path
of a packet. Although loops cannot occur with the GMA (described in
Table IV), messages can become trapped when all adjacent nodes have
been visited and the destination has not been reached. Trapping causes
many lost messages, defeating a basic philosophical constraint that
messages accepted by a network must be delivered. As shown by the
following examples borrowed from Neblock, trapping is equally damaging
to both the GMA and the RMA.

The phenomenon of trapping in the GMA is best explained with the
use of a diagram (Figure 36). Assume that a packet has been
generated  at  node  1 destined  for  node  5.  Congestion  within  node  1 
forces  it  to  select  the  port  to  node  2 for  transmission.  The  packet 
then  follows  a path  from  node  1 to  node  2 then  to  node  4.  In  the 
interim,  a new  minimum  delay  vector  has  been  issued  resulting  in  delay 
tables  being  updated  throughout  the  network.  Node  4 subsequently 
identifies  the  port  to  node  3 as  the  path  of  minimum  delay  to  node  5. 
Having  previously  visited  all  adjacent  nodes,  the  packet  becomes  trapped 
upon  being  received  by  node  3. 

Figure  36.  Message  trapping  in  the  GMA 

Table IV. Global Mapping Algorithm

1.  If this is the source node, turn the source map bit on and route
    the message via the minimum delay line.

2a. If this is the destination node, remove the message, else,

2b. turn the current node map bit on,

2c. select the minimum delay line from the delay table; retrieve the
    node map for the node terminating the minimum delay line; AND the
    message trace with the next node map; if the result is zero,
    route the message, else

2d. select the next best output line from the delay table and return
    to step 2c.
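
Steps 2b through 2d can be sketched with an integer used as the trace
bitmap. This is an illustrative sketch only: the function names are
hypothetical, the neighbor list is assumed to be already ordered by
increasing delay, and the neighbor ordering in the usage lines is
invented rather than taken from Figure 36.

```python
def mark_visited(trace, node):
    """Step 2b: turn on the map bit of the current node."""
    return trace | (1 << node)


def gma_next_hop(trace, ranked_neighbors):
    """Steps 2c and 2d: return the lowest-delay neighbor whose map bit is
    not yet set in the trace, or None when the message is trapped."""
    for neighbor in ranked_neighbors:
        if (trace & (1 << neighbor)) == 0:   # AND of trace and node map
            return neighbor
    return None                              # every neighbor already visited


# A packet that has visited nodes 1, 2 and 4 arrives at node 3.
trace = 0
for visited in (1, 2, 4, 3):
    trace = mark_visited(trace, visited)
next_hop = gma_next_hop(trace, [1, 4, 2])    # hypothetical neighbors of node 3
```

Here `next_hop` is `None`: every candidate line leads to a node whose
bit is already set, which is the trapping condition described in the
text.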


As a network using the GMA increases in size, available bandwidth
for user text decreases. This is because every node in the network
requires its own personal bit in every packet. The resulting overhead
makes the use of the GMA infeasible for medium and large scale networks.
Neblock addresses this problem to some extent in his investigation of
loop free algorithms. The RMA, referred to as partitioning, provides
a partial solution to the diminishing bandwidth problem for medium
scale networks. However, it, too, may become a victim of the trapping
phenomenon as explained in the following paragraph. Table V contains
a functional description of the RMA.
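
The growth of the GMA trace overhead with network size can be
illustrated with simple arithmetic. In the sketch below the user text
length of 1000 bits is a hypothetical value chosen for illustration, as
is the function name; only the one-bit-per-node rule comes from the
text.

```python
def gma_trace_fraction(num_nodes, text_bits):
    """Fraction of each packet consumed by a one-bit-per-node trace map."""
    return num_nodes / (num_nodes + text_bits)


# Hypothetical 1000-bit user text, for the network sizes discussed.
for nodes in (19, 36, 256):
    print(nodes, round(gma_trace_fraction(nodes, 1000), 3))
```

The fraction rises steadily with the number of nodes, which is the
diminishing bandwidth problem that partitioning is meant to ease.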

Figure 37 represents an arbitrary network whose partition 2
possesses connectivity to adjacent partitions as shown. Assume that a
packet originated at node 2 and traverses the path 2-4-5-8-7-6-3. It
has become trapped at node 3.

Figure  37.  Trapping  with  the  RMA 

Table V. Regional Mapping Algorithm (65)

1a. If this is the source vertex and it is an interior node, set the
    region bits to the current area, turn the source map bit on and
    route the message to the node on the minimum delay path to the
    destination, else

1b. if this is the source node, as well as a boundary node bordering
    a region not previously traversed, route the message to the next
    region; else route the message to the node on the minimum delay
    path within the source region.

2a. If this is the destination node remove the message, else

2b. if this is a boundary node to the last region traversed, set the
    region bits to the current area and turn on the node map bit,
    else

2c. if this is a boundary node to the next region to be traversed,
    route the message to the next region, else

2d. select the minimum delay line from the delay table, AND the trace
    map with the next node map; if the next node has not been
    previously traversed, transmit the message, else

2e. select the next best output line from the delay table and return
    to step 2d.

Note that step 2b has the effect of labeling all nodes in the region
just exited as previously visited. The advantages and disadvantages of
the RMA have been provided earlier. Reference is made to Chapters III
and IV for detailed descriptions of CAT3.

Performance data extracted from reference 65 are used to describe
the PUA, GMA, and RMA. Figure 38 compares the delay performances of
the  three  algorithms  to  CAT3.  The  simulator  was  reprogrammed  for  50 
kilobit  circuits  using  a 640  millisecond  update  interval  (as  used  by 
Neblock)  so  that  more  valid  comparisons  could  be  made.  The  results  of 
using  9.6  kilobit  circuits  are  also  plotted. 

The  solid  line  curves,  representing  the  simulation  of  the  PUA,  GMA, 
and RMA, were derived using a 19 node network configured similarly to an
early  version  of  the  ARPA  network.  The  50  kilobit  CAT3  data  were 
obtained  using  the  25  node  distributed  network.  Assuming  no  topological 
irregularities,  performance  of  CAT3  would  tend  to  be  somewhat  better  on 
the  19  node  ARPA  network  since  the  minimum  path  between  maximally 
separated  nodes  has  only  four  links  versus  the  eight  links  for  the  25 
node  network.  The  similar  performance  of  CAT3  to  the  RMA  and  GMA  serves 
to  strengthen  the  confidence  assumed  for  the  simulator. 

5.5  Results  of  Large  Scale  Network  Simulation 

Extreme care was taken in tuning the simulator for large networks.
Specifically, the simulator was tuned for 256 nodes to provide
approximately the same utilization measurements at load factor 0.2 as
obtained when simulating the 36 node network. This was necessary for
two reasons: 1) to provide for estimates of required (real) resources
and 2) to allow projections on the validity of the 256 node data when
compared to the 36 node data. The results of each simulated load
factor, averaging 87 minutes of execution each, were carefully analyzed
before proceeding to the next load factor. A final run at load factor
1.5 was made after it was determined that the larger network could
accommodate higher load factors before reaching saturation. The
following discussion is an aggregate of the analysis on all six load
factors. Table B1-16 contains the data that supports this analysis
(Figures 39 through 42).

The sharp incline for the average number of hops (Figure 39) between
LF = 0.2 and LF = 0.4 is attributed to the fixed routing characteristics
of the network under lightly loaded conditions. At load factor
0.2  the  average  delivery  time  is  computed  to  be  2.75  seconds  (Figure  41) 
while  the  average  interarrival  period  is  determined  to  be  20  seconds. 
Clearly,  most  messages  arriving  at  any  one  node  are  already  delivered 
before  succeeding  messages  arrive.  Average  queue  lengths  are  relatively 
short  and  therefore  have  little  influence  on  the  delay  anticipated  along 
a particular  path.  Each  message  traverses  the  network  along  a fixed 
route  computed  by  the  algorithm  (CAT3)  to  have  the  minimum  distance 
between  a given  originator/destination  node  pair. 

The routing characteristics of CAT3 tend initially to force messages
which must traverse greater distances toward fixed routing. This is
because of the overriding influence of the distance factor when combined
with delays contained in the delay table. As a message gets closer to
its destination, the relative influence of the two factors reverses,
with delay table entries becoming more dominant. This causes the
routing of messages with short distances to be more responsive to
fluctuating load conditions. The conclusion is that as messages get
closer to their destinations, a greater emphasis is placed on "time to
delivery" and less on "distance to travel".

The analysis indicates a trend of decreasing average hops per message
for load factors greater than 0.4. Average queue lengths continue to
increase at a linear rate. The same is true for utilization which is
derived as the ratio of average delay to average interarrival time.
The average delay remains relatively constant with minor increases at
each higher load factor.
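
As a rough check of the lightly loaded case, and assuming that the 2.75
second average delivery time reported for load factor 0.2 can be used as
the average delay in this ratio, the utilization for the 20 second
average interarrival period would be approximately

\[
\text{utilization} \approx \frac{2.75\ \text{s}}{20\ \text{s}} = 0.1375 \approx 0.14 .
\]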

The  length  of  a simulation  is  a direct  function  of  the  total  number 
of  messages  delivered,  i.e.,  the  run  is  terminated  when  the  average 
messages  delivered  per  node  reaches  a prespecified  level.  Therefore, 
as  load  factors  increase,  a greater  percentage  of  messages  delivered 
require  fewer  hops  than  at  lower  load  factors.  Messages  are  arriving 
at  a faster  pace  and  those  with  shorter  distances  to  travel  are  being 
delivered  at  a higher  rate. 

As  anticipated,  average  queue  lengths  increase  at  a linear  rate 
with  increased  load  factors.  A linear  increase  is  expected  until  looping 
occurs  at  which  time  the  network  saturates.  Queue  lengths  then  grow  at 
an  explosive  rate  with  corresponding  growths  in  average  hops,  delay,  and 
utilization. 

Based on analysis of data on the 36 node network and the rate of
utilization increase resulting from increased load factors on the 256
node network, saturation would have occurred at approximately LF = 2.0
where utilization is expected to exceed 1.3. In contrast, a 36 node
network will eventually saturate if it is allowed to operate at LF = 1.0
for extended periods. The difference is attributed to the greater
number of routing alternatives in large networks. To emphasize, the
dynamic aspect of the routing algorithm affords larger networks with
a greater distribution of fluctuating loads. One concludes that for a
given utilization starting value, larger networks maintain stability
much longer than smaller networks under increasing load conditions.
In that respect, CAT3 has demonstrated its robustness to varying loads
in large networks.

CHAPTER VI

CONCLUSIONS  AND  RECOMMENDATIONS 


6.  Conclusions 

The  primary  objective  of  this  research  was  to  investigate  and 
develop  a routing  algorithm  feasible  for  large  scale  networks.  The 
algorithm  must  retain  as  many  of  the  desirable  properties  of  current 
adaptive  routing  policies  as  possible.  Utilization,  as  defined  by 
classical  queuing  theory,  is  presented  as  the  basic  efficiency  metric, 
although  several  other  performance  measurements  were  made. 

Initial  investigations  were  devoted  to  minimizing  the  memory 
impact  for  maintaining  delay  tables.  For  modeling  purposes,  a network 
is  described  as  a set  of  trees  so  that  delay  information  at  each  node 
need  only  be  maintained  to  the  farthest  leaf  in  the  tree.  The  size 
of the tree was evaluated for optimum depth. The result of this
portion of the research was contrary to intuitive expectations: a
search depth of two was concluded to be optimum. Not only is the
execution overhead for selecting a minimum path significantly
decreased compared with a search depth of 3 or 4, but the table size
for delay data is also greatly reduced by a search depth of two.

Next, it was necessary to develop means for evaluating delay
between the leaves of the tree and the destination. A heuristic
measure was applied because actual delay is unknown beyond the leaves.
The only environmental descriptor of any given destination available
to an originator was determined to be a function of the address field
and the physical coordinates identifying the destination. As a result,
network topology was configured for use with the rectangular coordinate
system. The distance measure applied was the sum of the absolute
differences between the respective vertical and horizontal components
of the plane.
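
The distance measure described above can be written in a few lines. This
is a minimal sketch: the function name and the representation of a node
address as an (x, y) tuple are assumptions made for illustration.

```python
def coordinate_distance(source, destination):
    """Sum of the absolute differences of the horizontal and vertical
    coordinates of two nodes (rectangular, or Manhattan, distance)."""
    (x1, y1), (x2, y2) = source, destination
    return abs(x1 - x2) + abs(y1 - y2)


# Opposite corners of a 5 x 5 grid of nodes.
corner_to_corner = coordinate_distance((0, 0), (4, 4))
```

If the 25 node network is laid out as a 5 x 5 grid, which is an
assumption here, the corner-to-corner distance of eight agrees with the
eight-link separation of maximally separated nodes cited in Chapter V.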

Various  weights  were  applied  to  evaluate  a search  tree,  having 
determined  an  optimum  tree  depth  and  supporting  heuristic.  This  was 
done  because  fine  tuning  search  depth  two  produced  an  even  better 
performance.  An  inverted  exponential  bias  proved  to  be  the  best  of  the 
weights  evaluated.  Final  analysis  indicated  that  consistently  good 
performance  was  obtained  with  a search  depth  of  two  biased  with  an 
inverted  exponential  weight  and  increased  by  a heuristic  representation 
of  the  remaining  delay.  The  resulting  algorithm  was  the  third  variation 
of  the  coordinate  addressing  techniques  considered  and  therefore  named 
CAT  3. 

Although  the  algorithms  evaluated  create  noteworthy  savings  in 
nodal  storage  and  link  bandwidth,  looping  remains  a potential  problem. 
The  few  loops  produced  by  search  depth  two  occurred  at  very  high  load 
factors, normally above day-to-day operating loads. However, momentary
peaks of high loading will tend to produce some loops, with the additional
delay resulting in dissatisfied customers. The "detect and suppress"
methods  employed  in  this  research  to  decrease  looping  do  not  prevent 
loops.  A perfect  loop  prevention  mechanism,  if  one  exists  for  the  CAT, 
has  not  been  found. 

6.1 Recommendations

A considerable amount of resources was necessary to collect
simulated statistics for this research. Economic considerations
prohibit more than a demonstration that CAT 3 will perform well on very
large networks. The cost of an in-depth performance evaluation on
large scale networks is astronomical in terms of execution overhead.

Further  research  is  necessary  to  devise  methods  for  better  control 
on  conditions  which  tend  to  produce  looping.  More  effective  techniques 
should  be  developed  that  would  allow  an  algorithm  using  a heuristic 
measure  of  delay  to  perform  without  loops.  As  a practical  tradeoff, 
perhaps  there  exists  a technique  between  the  partitioning  scheme  and  the 
coordinate  addressing  technique  which  may  provide  a solution  to  the 
looping  phenomenon. 

Finally,  a unified  theory  for  large  scale  network  design  is 
necessary  which  incorporates  the  experience  gained  from  previous 
research.  Such  a project  could: 

1.  Investigate  and  define  parameters  that  characterize  workload 
and  performance  of  large  networks. 

2.  Develop a standard theory, symbolism, or language to describe
the various events and functions in large scale networks.

3.  Develop  a theory  of  design  for  large  scale  networks. 

The evolution of large data networks offers many unusual and
challenging problems as an advancing technology.

APPENDIX A


The following is a mathematical description of the Markovian birth-
death process for the M/M/C system. Standard Kendall notation is
used for arrival distribution/service distribution/servers where M
implies Poisson and exponential distributions for arrivals and service
respectively.

Let $p_n(t) = \Pr\{n \text{ in the system at time } t\}$ and $h$ represent some small
increment in time. Also let the arrival rate be $\lambda_n = \lambda$ for all $n$ and the
service rate be

\[
\mu_n =
\begin{cases}
n\mu & (1 \le n < c) \\
c\mu & (n \ge c).
\end{cases}
\tag{A1-1}
\]

Then the differential-difference equations are:

\[
p_n(t+h) = p_n(t)(1 - \lambda_n h)(1 - \mu_n h)
 + p_{n-1}(t)(\lambda_{n-1} h)(1 - \mu_{n-1} h)
 + p_{n+1}(t)(1 - \lambda_{n+1} h)(\mu_{n+1} h)
\tag{A1-2}
\]

and

\[
p_0(t+h) = p_0(t)(1 - \lambda_0 h) + p_1(t)(1 - \lambda_1 h)(\mu_1 h).
\tag{A1-3}
\]

Therefore

\[
\frac{p_n(t+h) - p_n(t)}{h}
 = -p_n(t)\,[\lambda_n + \mu_n - \lambda_n \mu_n h]
 + p_{n-1}(t)(\lambda_{n-1} - \lambda_{n-1} \mu_{n-1} h)
 + p_{n+1}(t)(\mu_{n+1} - \lambda_{n+1} \mu_{n+1} h)
\]

and

\[
\frac{p_0(t+h) - p_0(t)}{h} = -p_0(t)\lambda_0 + p_1(t)(\mu_1 - \lambda_1 \mu_1 h).
\tag{A1-4}
\]
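
The differential-difference equations are now determined as:

\[
\lim_{h \to 0} \frac{p_n(t+h) - p_n(t)}{h}
 = -p_n(t)(\lambda_n + \mu_n) + p_{n-1}(t)\lambda_{n-1} + p_{n+1}(t)\mu_{n+1}
\]

and

\[
\lim_{h \to 0} \frac{p_0(t+h) - p_0(t)}{h} = -p_0(t)\lambda_0 + p_1(t)\mu_1
\tag{A1-5}
\]

such that

\[
-p_n(\lambda_n + \mu_n) + \lambda_{n-1} p_{n-1} + \mu_{n+1} p_{n+1} = 0
\]

and

\[
-\lambda_0 p_0 + \mu_1 p_1 = 0.
\tag{A1-6}
\]

$p_{n+1}$ and $p_1$ are determined to be

\[
p_{n+1} = \frac{\lambda_n + \mu_n}{\mu_{n+1}}\, p_n - \frac{\lambda_{n-1}}{\mu_{n+1}}\, p_{n-1}
\tag{A1-7}
\]

and

\[
p_1 = \frac{\lambda_0}{\mu_1}\, p_0.
\tag{A1-8}
\]

Substituting A1-8 into A1-7,

\[
p_2 = \frac{\lambda_1 + \mu_1}{\mu_2}\, p_1 - \frac{\lambda_0}{\mu_2}\, p_0
    = \frac{\lambda_0 \lambda_1}{\mu_1 \mu_2}\, p_0.
\tag{A1-9}
\]

Substituting A1-9 into A1-7,

\[
p_3 = \frac{\lambda_0 \lambda_1 \lambda_2}{\mu_1 \mu_2 \mu_3}\, p_0.
\tag{A1-10}
\]

By induction

\[
p_n = \frac{\lambda_0 \lambda_1 \cdots \lambda_{n-1}}{\mu_1 \mu_2 \cdots \mu_n}\, p_0
    = p_0 \prod_{i=1}^{n} \frac{\lambda_{i-1}}{\mu_i}.
\tag{A1-11}
\]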
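
Substituting A1-1 into A1-11,

\[
p_n =
\begin{cases}
\dfrac{\lambda^n}{n!\,\mu^n}\, p_0 & (1 \le n < c) \\[2ex]
\dfrac{\lambda^n}{c^{n-c}\, c!\, \mu^n}\, p_0 & (n \ge c).
\end{cases}
\tag{A1-12}
\]

Using the fact that $\sum_{n=0}^{\infty} p_n = 1$, $p_0$ can be determined by

\[
p_0 \left[ \sum_{n=0}^{c-1} \frac{\lambda^n}{n!\,\mu^n}
 + \sum_{n=c}^{\infty} \frac{\lambda^n}{c^{n-c}\, c!\, \mu^n} \right] = 1.
\tag{A1-13}
\]

For simplicity of notation, let $r = \lambda/\mu$ and $\rho = r/c = \lambda/c\mu$. Equation
A1-13 becomes

\[
p_0 \left[ \sum_{n=0}^{c-1} \frac{r^n}{n!}
 + \sum_{n=c}^{\infty} \frac{r^n}{c^{n-c}\, c!} \right] = 1.
\tag{A1-14}
\]

The latter summation series in equation A1-14 is simplified as

\[
\sum_{n=c}^{\infty} \frac{r^n}{c^{n-c}\, c!}
 = \frac{r^c}{c!} \sum_{n=c}^{\infty} \left( \frac{r}{c} \right)^{n-c}
 = \frac{r^c}{c!} \sum_{m=0}^{\infty} \left( \frac{r}{c} \right)^{m}
 = \frac{r^c}{c!\,(1 - r/c)}, \qquad (r/c = \rho < 1),
\tag{A1-15}
\]

recognizing that $\sum_{m=0}^{\infty} (r/c)^m$ is a geometric series with the sum $1/(1 - r/c)$.

By substitution,

\[
\begin{aligned}
p_0 &= \left[ \sum_{n=0}^{c-1} \frac{r^n}{n!} + \frac{c\, r^c}{c!\,(c - r)} \right]^{-1}
 \qquad (r/c = \rho < 1) \\
    &= \left[ \sum_{n=0}^{c-1} \frac{1}{n!} \left( \frac{\lambda}{\mu} \right)^{n}
 + \frac{1}{c!} \left( \frac{\lambda}{\mu} \right)^{c}
 \left( \frac{c\mu}{c\mu - \lambda} \right) \right]^{-1}.
\end{aligned}
\tag{A1-16}
\]

Measures of effectiveness are now derivable having determined the
probability of zero messages in the system during steady state.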

The first measure of effectiveness of concern is the expected
queue size, $L_q$. By definition, the total length of queues is equivalent
to the number in the system reduced by the number of servers. Therefore,

\[
L_q = \sum_{n=c}^{\infty} (n - c)\, p_n
    = \sum_{n=c}^{\infty} \frac{n\, r^n\, p_0}{c^{n-c}\, c!}
    - \sum_{n=c}^{\infty} \frac{c\, r^n\, p_0}{c^{n-c}\, c!}.
\]

Again using the sum of a geometric series,

\[
L_q = \left[ \frac{(\lambda/\mu)^c\, \lambda \mu}{(c-1)!\,(c\mu - \lambda)^2} \right] p_0.
\tag{A1-18}
\]

Applying Little's formula (31) the expected waiting time is

\[
W_q = \frac{L_q}{\lambda}
    = \left[ \frac{(\lambda/\mu)^c\, \mu}{(c-1)!\,(c\mu - \lambda)^2} \right] p_0,
\tag{A1-19}
\]

the waiting time in the system is

\[
W = \frac{1}{\mu} + \left[ \frac{(\lambda/\mu)^c\, \mu}{(c-1)!\,(c\mu - \lambda)^2} \right] p_0,
\]

and finally, the expected number in the system is

\[
L = \lambda W = \frac{\lambda}{\mu}
  + \left[ \frac{(\lambda/\mu)^c\, \lambda \mu}{(c-1)!\,(c\mu - \lambda)^2} \right] p_0.
\]

Though waiting time distributions are useful in some applications, their
use for network analysis is limited. Reference is made to Gross and
Harris (37) for their derivation.
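
The measures of effectiveness can be evaluated numerically from the
standard M/M/c results for the probability of an empty system, expected
queue size, and waiting times. The following is a minimal sketch; the
function name, argument names, and the dictionary returned are
illustrative choices and are not part of the simulator.

```python
from math import factorial


def mmc_measures(lam, mu, c):
    """Steady-state M/M/c measures for arrival rate lam, service rate mu
    per server, and c servers. Requires lam / (c * mu) < 1."""
    r = lam / mu
    rho = r / c
    if rho >= 1:
        raise ValueError("no steady state: utilization must be below 1")
    # Probability of zero messages in the system.
    p0 = 1.0 / (sum(r**n / factorial(n) for n in range(c))
                + r**c / (factorial(c) * (1 - rho)))
    # Expected queue size.
    lq = (r**c * lam * mu) / (factorial(c - 1) * (c * mu - lam) ** 2) * p0
    wq = lq / lam            # Little's formula: waiting time in queue
    w = wq + 1.0 / mu        # waiting time in the system
    l = lam * w              # expected number in the system
    return {"p0": p0, "Lq": lq, "Wq": wq, "W": w, "L": l}


single_server = mmc_measures(lam=1.0, mu=2.0, c=1)
```

For c = 1 the results reduce to the familiar single-server values, which
provides a quick check on the formulas.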


APPENDIX B

The  data  contained  in  Appendix  B supports  the  graphical  presentations 
shown  in  Chapter  V.  Headings  and  titles  make  each  table  self  explanatory 
and  therefore  little  additional  comment  is  necessary. 


Table B1-1. Sixteen node network max hop/average hop data.

Table B1-5. Twenty-five node network queue length/observation data.



Table B1-6. Thirty-six node network queue length/observation data.

Table B1-7. Sixteen node network average delay data.

Table B1-9. Thirty-six node network average delay data.

Table B1-10. Sixteen node network utilization data.

Table B1-11. Twenty-five node network utilization data.

Table B1-13. Supporting data for three algorithms evaluated.

Table B1-16. 256 node network data.

Appendix C contains a listing of the network simulator program
used in support of research for this dissertation. The user's guide
preceding the expanded variable definition list and listing explains
the various options available and the type and format of the input
parameters. The program is sufficiently well documented that those
versed in FORTRAN should experience no difficulty interpreting the
event sequence within the program.

1. User's Guide

a.  Description  of  Input  Parameters 
(1)  Simulator  Input  Requirements 

(a)  NUMNOD - Establishes the number of nodes to be
simulated. NUMNOD must meet the following
restrictions: NUMNOD = N², 2 ≤ N ≤ 32.

(b)  MINLK  - Determines  the  minimum  number  of  links 
possessed  by  each  node. 

(c)  MAXLK  - Determines  the  maximum  number  of  links 
allowed  each  node. 

(d)  NBUF  - Establishes  the  maximum  number  of  buffers 
at  each  node. 

(e)  MSGLVL  - Establishes  the  maximum  number  of  messages 
allowed  active  in  the  network  at  one  time.  The 
appropriate  value  of  MSGLVL  is  derived  as  a function 
of  n where  NSET(n)  is  the  GASP  event  and  entity 
storage  array. 

(f)  MAXSPL  - Establishes  the  number  of  messages  to  be 
delivered  for  a given  simulation  run. 

(g)  LKAHD - Determines the search depth value to be
simulated. LKAHD is constrained by 1 ≤ LKAHD ≤ 4;
all values greater than 4 are converted to 4.

(h)  NDIGIT - A load factor synchronization parameter.
Adjust NDIGIT (used in subroutine OTPUT) during
simulator tuning to give an approximate 0.2 measured
load factor for an input RE = 0.2. Subsequent runs
with increasingly higher RE's should result in
measured load factors approximating RE.

(i)  NOPRT - A print/no print flag for printing nodal
coordinates, adjacency matrix, density matrix, etc.
NOPRT < 0: do not print
NOPRT > 0: print

(j)  ITEM - A threshold for generating interim output.
For a given ITEM value, say I, statistics will be
printed for every I messages generated and for every
I messages delivered.

(k)  IROUT - A flag to prevent generation of minimum
delay vectors.
IROUT = 0: do not generate MDV's
IROUT ≠ 0: generate MDV's

(l)  NRFM - A hop inhibit threshold. The hop suppress
mechanism will be implemented for LKAHD hops whenever
hops = 2*NRFM*SQRT(NUMNOD).

(m)  RE  - A load  factor  starting  parameter. 

(n)  REINC  - An  increment  value  for  RE  when  executing  two 
or  more  runs. 

(o)  UPTINT - Establishes the update interval for
generating update vectors.

(p)  XLN  - Determines  the  user  packet  length. 

(q)  BIAS  - A means  of  weighting  the  effects  of  the 
heuristic  function  on  the  evaluation  of  total  delay. 


(r)  DNSIPT - A network loading variable for varying the
interval of message generation. The following
values have been realized at RE = 0.2 using the
simulator random number generator.

DNSIPT    Avg msg interval (in secs)

0.004     20.69
0.010     10.05
0.015      5.85
0.020      2.79
0.048      2.02
0.065      1.64
0.095      1.36

(2)  GASP  Input  Requirements  - Different  GASP  input  is  required 
for  each  network  size  to  be  simulated.  A detailed 
description  of  GASP  input  parameters  is  provided  by 
Pritzker  (68)  on  pages  67-80. 
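
The numeric constraints stated for NUMNOD, LKAHD, and NRFM in
paragraph a(1) can be checked before a run. The sketch below is
illustrative only: the function names are hypothetical, it is written in
Python rather than taken from the FORTRAN listing, and the rejection of
LKAHD values below 1 is an assumption (the guide only states that values
above 4 are converted to 4).

```python
import math


def check_numnod(numnod):
    """Return N when NUMNOD = N*N with 2 <= N <= 32, else raise ValueError."""
    if numnod < 0:
        raise ValueError("NUMNOD must be positive")
    n = math.isqrt(numnod)
    if n * n != numnod or not 2 <= n <= 32:
        raise ValueError("NUMNOD must be a perfect square N*N, 2 <= N <= 32")
    return n


def clamp_lkahd(lkahd):
    """Convert search depths above 4 to 4; reject values below 1 (assumed)."""
    if lkahd < 1:
        raise ValueError("LKAHD must be at least 1")
    return min(lkahd, 4)


def hop_threshold(nrfm, numnod):
    """Hop count 2*NRFM*SQRT(NUMNOD) used by the hop suppress mechanism."""
    return 2 * nrfm * math.sqrt(numnod)
```

For example, the 36 node network gives N = 6, and with NRFM = 1 the hop
threshold evaluates to 12.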

b.  Format for Simulator Input: Three types of input are possible
depending upon the simulation options desired. The first two
cards are always required.

(1)  Format (12(2X,I8)): for inputting all integer values
described in paragraph a(1) above.

(2)  Format (6(2X,F8.3)): for inputting all real values
described in paragraph a(1) above.

(3)  Format (2(F5.2,2X),I5): Used for additional runs
involving re-initialization of values for RE, REINC, and
MAXSPL. This input is placed at the end of the GASP
input parameters. It should be noted that the appropriate
parameter in GASP (NNRNS) must reflect the total number of
runs desired including those with re-initialized simulator
parameters.
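
The fixed-column layout of the first card can be illustrated with a
short reader. This is a sketch under stated assumptions: it is written
in Python rather than FORTRAN, the function name is hypothetical, and
the order of the twelve integer values on the card is assumed to follow
the order of paragraph a(1), which the guide does not state explicitly.

```python
def read_integer_card(card, fields=12, skip=2, width=8):
    """Read a card laid out as (12(2X,I8)): each field is two skipped
    columns followed by an eight-column integer. Blank fields read as 0."""
    values = []
    pos = 0
    for _ in range(fields):
        pos += skip                       # 2X: skip two columns
        text = card[pos:pos + width]      # I8: eight-column integer field
        values.append(int(text) if text.strip() else 0)
        pos += width
    return values


# Hypothetical card contents, right-justified in each eight-column field.
sample = [36, 2, 4, 10, 500, 1000, 2, 3, 1, 100, 1, 2]
card = "".join("  " + str(value).rjust(8) for value in sample)
parsed = read_integer_card(card)
```

Each value must be right-justified within its eight columns, since a
value that spills into the two skipped columns would be misread.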

c.  Simulation Options: Table C1-1.

2.  COMMON  Variables  in  the  Simulator 

The following variable definitions are provided for ease of
understanding program flow. Variable definitions are normally contained
in comments to the program listing; however, in this particular case the
number of variables to be defined warrants a separate composite
reference. Each of the variables is expressed in the indicated user
COMMON statements.

The  definitions  identify  the  routine  of  primary  utilization  as  well 
as  the  purpose  of  each  variable.  No  variable  is  used  for  a dual  purpose. 


a. COMMON/UCOM1/

(1) ADJMAT - adjacency matrix; used throughout the program,
although its elements are generated in the driver.
Column 1 contains the numeric node designation, columns
2-5 indicate adjacent node connectivity, column 6 contains
the number of adjacent nodes for the node identified in
column 1, and column 7 is a running total of links.

(2) TRFMAT - traffic matrix; generated in the driver and used
throughout the simulator. TRFMAT reflects the traffic
density between each node and its adjacent nodes. It is
used in INTLC to generate the mean interarrival rate for
each node.

(3) CAPMAT - capacity matrix; generated in the driver to
reflect the capacity of each link in the simulated network.
Although links are set to 9.6 kbps in the driver, different
capacities are possible with minor adjustments.


Table  Cl-1.  Simulation  options. 

(4)  XMU  - mean  interarrival  rate  array;  generated  in  INTLC  by 
averaging  the  traffic  densities  of  connected  links  obtained 
from  TRFMAT. 

(5) ITRACE - unused; originally intended as a message trace
switch, but the storage requirements became excessive.

(6)  DP  - unused. 

(7) XCORD - x coordinate array; generated in the driver and
used in RTALGO to evaluate the weighted distance to the
destination node.

(8)  YCORD  - y coordinate  array;  same  as  XCORD. 

(9)  Z - random  number  variable;  used  throughout  the  program. 

(10) BSYLIN - busy line array; reflects the status (busy/not
busy) of the respective interconnecting line. BSYLIN is
used in several routines where availability of resources
must be evaluated before releasing messages for
transmission.

(11) MATHAG - used in RTALGO and PATHAG. MATHAG is an array of
delay data to be compared for selecting the next (adjacent)
node. It contains only the results of the cost evaluations
to adjacent nodes in the current tree.

(12)  LSORC  - an  array  of  cross  references  between  source  nodes 
and  link  numbers.  LSORC  is  used  in  ENCODE  and  LINDEL  to 
identify  the  most  recent  source  of  a message. 

(13)  LDEST  - an  array  of  cross  references  of  link  numbers  to 
destinations.  LDEST  values  provide  indices  to  GASP  for 
filing  messages  queued  for  transmission.  It  is  used  in 
ENCODE  and  LINDEL  for  obtaining  destination  node  numbers. 


(14)  DISMAT  - distance  matrix;  contains  the  distance  from 
nodes  to  their  adjacently  connected  nodes.  DISMAT  is 
established  in  the  driver  and  used  in  GENMDV  to  reflect 
propagation  delay  in  the  minimum  delay  vector  being 
generated. 

(15) JQUE - input queue array; used in several routines to
tabulate a count of messages awaiting processing by a
node.

(16)  IQUE  - output  queue  array;  used  in  several  routines  to 
tabulate  a count  of  messages  awaiting  transmission  over 
an  outgoing  link. 

(17) NUMLIN - number of lines; a summation of ADJMAT(I,6)
reflecting the total number of lines in the network.
NUMLIN was originally used as a bound for checking whether
an erroneous line number was being used. Its use was
later discontinued to conserve execution time.

(18) KSTOR - a temporary storage of the first index into MATHAG
identifying the next node of least cost. KSTOR is used
in RTALGO to pass the index back to CPUSVC.

(19) DELMAT - delay matrix; used in LINDEL to preserve the
delay data contained in the minimum delay vector and in
PATHAG to evaluate minimum-cost paths.

(20) MSGEVT - message event; used by GASP as an event filing
parameter. MSGEVT is subsequently used by EVNTS to
identify the type of event occurring. Its value is
initialized in the driver.

(21) JSTIMP - just an imp event; same as MSGEVT except for imp
inputs.

(22) IMPEVT - imp event; same as MSGEVT except for imp
transmissions.

(23)  NFNMTO  - negative  request  for  next  message;  same  as  MSGEVT. 

(24)  LEVENT  - L event;  same  as  MSGEVT  for  identifying  a minimum 
delay  vector  generation  event. 

(25)  NCKTO  - acknowledge  event;  same  as  MSGEVT. 

(26)  KMITVT  - transmission  event;  same  as  MSGEVT. 

(27)  INCDVT  - incident  event;  same  as  MSGEVT  to  identify  and 
file  erred  messages  for  retransmission. 

(28)  NODQUE  - internal  node  queue  files;  same  as  MSGEVT. 

(29)  IMAT  - integer  matrix;  used  in  earlier  simulator  testing. 
Its  use  was  discontinued. 

(30)  IMATPT  - same  as  IMAT. 

(31)  IVALS  - same  as  IMAT. 

(32)  GMAT  - same  as  IMAT. 

(33)  ITB  - buffer  array;  used  by  several  routines  to  tabulate 
buffer  utilization.  ITB  contents  are  printed  in  OTPUT. 

(34)  LINMAT  - line  matrix;  a matrix  of  line  numbers  between 
consecutively  numbered  nodes  and  adjacent  nodes.  LINMAT 
is  generated  in  the  driver  and  used  in  several  routines  as 
a cross-reference  for  re-identifying  the  immediate  source 
node. 

(35) XYLN - average link transmission interval; initialized in
the driver to minimize execution time and used throughout
the simulator.

(36) XZLN - average line transmission interval; same as XYLN.

(37)  LKPT  - same  as  IMAT. 

(38)  LINBSY  - line  busy;  a variable  for  accumulating  the  number 
of  times  a needed  line  was  found  busy.  LINBSY  is  altered 
in  several  routines  and  printed  in  subroutine  OTPUT. 

(39) NOBUFR - no buffer; a variable for accumulating the number
of times a message was refused service because of
insufficient node buffer space. Same as LINBSY.

(40)  ICPBSY  - CPU  busy;  used  to  accumulate  the  number  of  times 
service  was  refused  because  a node  CPU  was  busy.  Same  as 
LINBSY. 
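
The ADJMAT column layout in item (1), the NUMLIN summation in item (17), and the XMU averaging in item (4) can be sketched as follows. This Python fragment is illustrative only and is not the simulator's FORTRAN. The function names and the input dictionary are hypothetical. It assumes unused adjacency slots hold zero, that column 7 is a cumulative sum of column 6, and that row i of TRFMAT lists the traffic densities of node i's connected links in the same order as ADJMAT columns 2-5. None of these details is stated in the text above.

```python
MAX_ADJ = 4  # columns 2-5 hold at most four adjacent nodes


def build_adjmat(neighbors):
    """Build rows of [node, adj1..adj4, degree, running total of links].

    neighbors is a hypothetical dict mapping node -> list of adjacent nodes.
    """
    rows = []
    running = 0
    for node in sorted(neighbors):
        adj = list(neighbors[node])
        if len(adj) > MAX_ADJ:
            raise ValueError("node %d has more than %d neighbors" % (node, MAX_ADJ))
        running += len(adj)
        padding = [0] * (MAX_ADJ - len(adj))  # assumed: zero marks an unused slot
        rows.append([node] + adj + padding + [len(adj), running])
    return rows


def numlin(adjmat):
    """Sum column 6 (index 5) over all rows, as described for NUMLIN."""
    return sum(row[5] for row in adjmat)


def mean_interarrival(adjmat, trfmat):
    """Average each node's connected-link densities, as described for XMU."""
    xmu = []
    for row, densities in zip(adjmat, trfmat):
        degree = row[5]
        xmu.append(sum(densities[:degree]) / degree if degree else 0.0)
    return xmu
```

Under these assumptions, the last entry in column 7 equals NUMLIN, and a connection between two nodes is counted once at each end. Whether the original counts a line once or once per direction is not stated.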

b. COMMON/UCOM2/

(1)  RELIM  - use  discontinued;  originally  used  in  MSG  as  a 
stopping  parameter  identified  as  the  load  factor  limit. 

(2)  FIRST  - a flag  initialized  to  zero  in  the  driver  and 
incremented  in  INTLC  to  reflect  the  number  of  simulation 
runs  made. 

(3)  RESAVE  - a temporary  storage  variable  for  preserving  the 
load  factor.  Its  use  was  discontinued. 

(4)  UDTLIM  - use  discontinued. 

c. COMMON/UCOM4/IOUT - messages delivered; IOUT accumulates the
number of messages delivered and is compared to MAXSPL in MSG
as a stopping parameter.
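
As a closing illustration for this section, the relationship between MATHAG and KSTOR described in items a(11) and a(18) amounts to recording the position of the smallest cost. The Python sketch below is a simplified assumption, not the RTALGO code. It treats MATHAG as a one-dimensional list of costs to the adjacent nodes, uses 1-based positions in the FORTRAN manner, and keeps the first position on ties. The function name is hypothetical.

```python
def least_cost_index(mathag):
    """Return the 1-based position of the smallest cost in mathag.

    Assumption: ties are resolved in favor of the earliest position.
    """
    if not mathag:
        raise ValueError("no adjacent-node costs to compare")
    kstor = 1
    for k, cost in enumerate(mathag, start=1):
        if cost < mathag[kstor - 1]:
            kstor = k  # strictly smaller cost found; remember its position
    return kstor
```

The returned position plays the role that KSTOR plays when the index is passed back to CPUSVC.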


3.  Listing  of  Simulator 